[2/2] spark git commit: Preparing development version 1.4.0-SNAPSHOT

2015-05-29 Thread pwendell
Preparing development version 1.4.0-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6bf5a420
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6bf5a420
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6bf5a420

Branch: refs/heads/branch-1.4
Commit: 6bf5a42084d5f5c601d3c41358a12bddeedb
Parents: f279681
Author: Patrick Wendell pwend...@gmail.com
Authored: Thu May 28 23:40:27 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Thu May 28 23:40:27 2015 -0700

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 launcher/pom.xml  | 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 unsafe/pom.xml| 2 +-
 yarn/pom.xml  | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6bf5a420/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b8a821d..626c857 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6bf5a420/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c1aa32b..1f3dec9 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6bf5a420/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index a9b8b42..e58efe4 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6bf5a420/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 38ff67d..e4efee7 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6bf5a420/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index e8784eb..1f3e619 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+    <version>1.4.0-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6bf5a420/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 1794f3e..8df7edb 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0</version>
+

Git Push Summary

2015-05-29 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc3 [created] f2796816b

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[1/2] spark git commit: Preparing Spark release v1.4.0-rc3

2015-05-29 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 55dc7a693 -> 6bf5a4208


Preparing Spark release v1.4.0-rc3


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f2796816
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f2796816
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f2796816

Branch: refs/heads/branch-1.4
Commit: f2796816bea12a7894519c6882b73f0ef5b99b14
Parents: 55dc7a6
Author: Patrick Wendell pwend...@gmail.com
Authored: Thu May 28 23:40:22 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Thu May 28 23:40:22 2015 -0700

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 launcher/pom.xml  | 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 unsafe/pom.xml| 2 +-
 yarn/pom.xml  | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f2796816/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 626c857..b8a821d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f2796816/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 1f3dec9..c1aa32b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f2796816/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index e58efe4..a9b8b42 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f2796816/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index e4efee7..38ff67d 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f2796816/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 1f3e619..e8784eb 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f2796816/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 8df7edb..1794f3e 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
 

spark git commit: [SPARK-7912] [SPARK-7921] [MLLIB] Update OneHotEncoder to handle ML attributes and change includeFirst to dropLast

2015-05-29 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 6bf5a4208 -> 509a7cafc


[SPARK-7912] [SPARK-7921] [MLLIB] Update OneHotEncoder to handle ML attributes 
and change includeFirst to dropLast

This PR contains two major changes to `OneHotEncoder`:

1. more robust handling of ML attributes. If the input attribute is unknown, we 
look at the values to get the max category index
2. change `includeFirst` to `dropLast` and leave the default as `true`. There 
are a couple of benefits:

a. consistent with other tutorials of one-hot encoding (or dummy coding) 
(e.g., http://www.ats.ucla.edu/stat/mult_pkg/faq/general/dummy.htm)
b. keep the indices unmodified in the output vector. If we drop the first, 
all indices will be shifted by 1.
c. If users use `StringIndexer`, the last element is the least frequent one.

Sorry for including two changes in one PR! I'll update the user guide in 
another PR.
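
For context, a minimal usage sketch of the renamed parameter (not part of this 
patch; it assumes a SQLContext in scope and a DataFrame `df` with a string 
column named "category"):

  import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}

  // Map the string column to category indices first.
  val indexed = new StringIndexer()
    .setInputCol("category")
    .setOutputCol("categoryIndex")
    .fit(df)
    .transform(df)

  // Encode the indices; dropLast = true (the new default) omits the last
  // category so the encoded columns no longer sum to one.
  val encoded = new OneHotEncoder()
    .setInputCol("categoryIndex")
    .setOutputCol("categoryVec")
    .setDropLast(true)
    .transform(indexed)

  encoded.show()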

jkbradley sryza

Author: Xiangrui Meng m...@databricks.com

Closes #6466 from mengxr/SPARK-7912 and squashes the following commits:

a280dca [Xiangrui Meng] fix tests
d8f234d [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into 
SPARK-7912
171b276 [Xiangrui Meng] mention the difference between our impl vs sklearn's
00dfd96 [Xiangrui Meng] update OneHotEncoder in Python
208ddad [Xiangrui Meng] update OneHotEncoder to handle ML attributes and change 
includeFirst to dropLast

(cherry picked from commit 23452be944463dae72a35b58551040556dd3aeb5)
Signed-off-by: Xiangrui Meng m...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/509a7caf
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/509a7caf
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/509a7caf

Branch: refs/heads/branch-1.4
Commit: 509a7cafccc7ce6a64a159a2647ed56e52ed5df9
Parents: 6bf5a42
Author: Xiangrui Meng m...@databricks.com
Authored: Fri May 29 00:51:12 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Fri May 29 00:51:24 2015 -0700

--
 .../apache/spark/ml/feature/OneHotEncoder.scala | 160 +--
 .../spark/ml/feature/OneHotEncoderSuite.scala   |  42 -
 python/pyspark/ml/feature.py|  58 ---
 3 files changed, 176 insertions(+), 84 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/509a7caf/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
index eb6ec49..8f34878 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
@@ -17,94 +17,152 @@
 
 package org.apache.spark.ml.feature
 
-import org.apache.spark.SparkException
 import org.apache.spark.annotation.Experimental
-import org.apache.spark.ml.UnaryTransformer
-import org.apache.spark.ml.attribute.{Attribute, BinaryAttribute, NominalAttribute}
+import org.apache.spark.ml.Transformer
+import org.apache.spark.ml.attribute._
 import org.apache.spark.ml.param._
 import org.apache.spark.ml.param.shared.{HasInputCol, HasOutputCol}
 import org.apache.spark.ml.util.{Identifiable, SchemaUtils}
-import org.apache.spark.mllib.linalg.{Vector, VectorUDT, Vectors}
-import org.apache.spark.sql.types.{DataType, DoubleType, StructType}
+import org.apache.spark.mllib.linalg.Vectors
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.functions.{col, udf}
+import org.apache.spark.sql.types.{DoubleType, StructType}
 
 /**
  * :: Experimental ::
- * A one-hot encoder that maps a column of label indices to a column of binary vectors, with
- * at most a single one-value. By default, the binary vector has an element for each category, so
- * with 5 categories, an input value of 2.0 would map to an output vector of
- * (0.0, 0.0, 1.0, 0.0, 0.0). If includeFirst is set to false, the first category is omitted, so the
- * output vector for the previous example would be (0.0, 1.0, 0.0, 0.0) and an input value
- * of 0.0 would map to a vector of all zeros. Including the first category makes the vector columns
- * linearly dependent because they sum up to one.
+ * A one-hot encoder that maps a column of category indices to a column of binary vectors, with
+ * at most a single one-value per row that indicates the input category index.
+ * For example with 5 categories, an input value of 2.0 would map to an output vector of
+ * `[0.0, 0.0, 1.0, 0.0]`.
+ * The last category is not included by default (configurable via [[OneHotEncoder!.dropLast]]
+ * because it makes the vector entries sum up to one, and hence linearly dependent.
+ * So an input value of 

spark git commit: [SPARK-7912] [SPARK-7921] [MLLIB] Update OneHotEncoder to handle ML attributes and change includeFirst to dropLast

2015-05-29 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master 97a60cf75 -> 23452be94


[SPARK-7912] [SPARK-7921] [MLLIB] Update OneHotEncoder to handle ML attributes 
and change includeFirst to dropLast

This PR contains two major changes to `OneHotEncoder`:

1. more robust handling of ML attributes. If the input attribute is unknown, we 
look at the values to get the max category index
2. change `includeFirst` to `dropLast` and leave the default as `true`. There 
are a couple of benefits:

a. consistent with other tutorials of one-hot encoding (or dummy coding) 
(e.g., http://www.ats.ucla.edu/stat/mult_pkg/faq/general/dummy.htm)
b. keep the indices unmodified in the output vector. If we drop the first, 
all indices will be shifted by 1.
c. If users use `StringIndexer`, the last element is the least frequent one.

Sorry for including two changes in one PR! I'll update the user guide in 
another PR.

jkbradley sryza

Author: Xiangrui Meng m...@databricks.com

Closes #6466 from mengxr/SPARK-7912 and squashes the following commits:

a280dca [Xiangrui Meng] fix tests
d8f234d [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into 
SPARK-7912
171b276 [Xiangrui Meng] mention the difference between our impl vs sklearn's
00dfd96 [Xiangrui Meng] update OneHotEncoder in Python
208ddad [Xiangrui Meng] update OneHotEncoder to handle ML attributes and change 
includeFirst to dropLast


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/23452be9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/23452be9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/23452be9

Branch: refs/heads/master
Commit: 23452be944463dae72a35b58551040556dd3aeb5
Parents: 97a60cf
Author: Xiangrui Meng m...@databricks.com
Authored: Fri May 29 00:51:12 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Fri May 29 00:51:12 2015 -0700

--
 .../apache/spark/ml/feature/OneHotEncoder.scala | 160 +--
 .../spark/ml/feature/OneHotEncoderSuite.scala   |  42 -
 python/pyspark/ml/feature.py|  58 ---
 3 files changed, 176 insertions(+), 84 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/23452be9/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
index eb6ec49..8f34878 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala
@@ -17,94 +17,152 @@
 
 package org.apache.spark.ml.feature
 
-import org.apache.spark.SparkException
 import org.apache.spark.annotation.Experimental
-import org.apache.spark.ml.UnaryTransformer
-import org.apache.spark.ml.attribute.{Attribute, BinaryAttribute, NominalAttribute}
+import org.apache.spark.ml.Transformer
+import org.apache.spark.ml.attribute._
 import org.apache.spark.ml.param._
 import org.apache.spark.ml.param.shared.{HasInputCol, HasOutputCol}
 import org.apache.spark.ml.util.{Identifiable, SchemaUtils}
-import org.apache.spark.mllib.linalg.{Vector, VectorUDT, Vectors}
-import org.apache.spark.sql.types.{DataType, DoubleType, StructType}
+import org.apache.spark.mllib.linalg.Vectors
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.functions.{col, udf}
+import org.apache.spark.sql.types.{DoubleType, StructType}
 
 /**
  * :: Experimental ::
- * A one-hot encoder that maps a column of label indices to a column of binary vectors, with
- * at most a single one-value. By default, the binary vector has an element for each category, so
- * with 5 categories, an input value of 2.0 would map to an output vector of
- * (0.0, 0.0, 1.0, 0.0, 0.0). If includeFirst is set to false, the first category is omitted, so the
- * output vector for the previous example would be (0.0, 1.0, 0.0, 0.0) and an input value
- * of 0.0 would map to a vector of all zeros. Including the first category makes the vector columns
- * linearly dependent because they sum up to one.
+ * A one-hot encoder that maps a column of category indices to a column of binary vectors, with
+ * at most a single one-value per row that indicates the input category index.
+ * For example with 5 categories, an input value of 2.0 would map to an output vector of
+ * `[0.0, 0.0, 1.0, 0.0]`.
+ * The last category is not included by default (configurable via [[OneHotEncoder!.dropLast]]
+ * because it makes the vector entries sum up to one, and hence linearly dependent.
+ * So an input value of 4.0 maps to `[0.0, 0.0, 0.0, 0.0]`.
+ * Note that this is different from scikit-learn's OneHotEncoder, which keeps all 

spark git commit: [SPARK-7929] Turn whitespace checker on for more token types.

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 119c93af9 -> 55dc7a693


[SPARK-7929] Turn whitespace checker on for more token types.

This is the last batch of changes to complete SPARK-7929.

Previous related PRs:
https://github.com/apache/spark/pull/6480
https://github.com/apache/spark/pull/6478
https://github.com/apache/spark/pull/6477
https://github.com/apache/spark/pull/6476
https://github.com/apache/spark/pull/6475
https://github.com/apache/spark/pull/6474
https://github.com/apache/spark/pull/6473

Author: Reynold Xin r...@databricks.com

Closes #6487 from rxin/whitespace-lint and squashes the following commits:

b33d43d [Reynold Xin] [SPARK-7929] Turn whitespace checker on for more token 
types.

(cherry picked from commit 97a60cf75d1fed654953eccedd04f3442389c5ca)
Signed-off-by: Reynold Xin r...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/55dc7a69
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/55dc7a69
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/55dc7a69

Branch: refs/heads/branch-1.4
Commit: 55dc7a693368ddbd850459034709e3dd751dbcf3
Parents: 119c93a
Author: Reynold Xin r...@databricks.com
Authored: Thu May 28 23:00:02 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Thu May 28 23:00:08 2015 -0700

--
 .../flume/sink/TransactionProcessor.scala |  2 +-
 .../spark/streaming/flume/EventTransformer.scala  |  2 +-
 .../spark/streaming/kafka/KafkaRDDSuite.scala |  2 +-
 .../spark/streaming/mqtt/MQTTInputDStream.scala   | 14 +-
 .../examples/streaming/KinesisWordCountASL.scala  |  2 +-
 .../spark/streaming/kinesis/KinesisUtils.scala|  4 ++--
 scalastyle-config.xml | 13 -
 .../spark/sql/hive/HiveInspectorSuite.scala   | 12 ++--
 .../spark/sql/hive/InsertIntoHiveTableSuite.scala |  2 +-
 .../apache/spark/sql/hive/ListTablesSuite.scala   |  2 +-
 .../org/apache/spark/sql/hive/UDFSuite.scala  |  6 +++---
 .../sql/hive/execution/HiveComparisonTest.scala   |  4 ++--
 .../sql/hive/execution/HiveResolutionSuite.scala  |  6 +++---
 .../sql/hive/execution/HiveTableScanSuite.scala   |  2 +-
 .../spark/sql/hive/execution/SQLQuerySuite.scala  |  2 +-
 .../org/apache/spark/sql/hive/parquetSuites.scala |  4 ++--
 .../org/apache/spark/deploy/yarn/Client.scala |  6 +++---
 .../yarn/ClientDistributedCacheManager.scala  | 18 +-
 .../apache/spark/deploy/yarn/ClientSuite.scala|  2 +-
 19 files changed, 52 insertions(+), 53 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/55dc7a69/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
--
diff --git 
a/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
 
b/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
index ea45b14..7ad43b1 100644
--- 
a/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
+++ 
b/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
@@ -143,7 +143,7 @@ private class TransactionProcessor(val channel: Channel, 
val seqNum: String,
   eventBatch.setErrorMsg(msg)
 } else {
   // At this point, the events are available, so fill them into the 
event batch
-  eventBatch = new EventBatch("",seqNum, events)
+  eventBatch = new EventBatch("", seqNum, events)
 }
   })
 } catch {

http://git-wip-us.apache.org/repos/asf/spark/blob/55dc7a69/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
--
diff --git 
a/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
 
b/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
index dc629df..65c49c1 100644
--- 
a/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
+++ 
b/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
@@ -60,7 +60,7 @@ private[streaming] object EventTransformer extends Logging {
 out.write(body)
 val numHeaders = headers.size()
 out.writeInt(numHeaders)
-for ((k,v) <- headers) {
+for ((k, v) <- headers) {
   val keyBuff = Utils.serialize(k.toString)
   out.writeInt(keyBuff.length)
   out.write(keyBuff)

http://git-wip-us.apache.org/repos/asf/spark/blob/55dc7a69/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaRDDSuite.scala

spark git commit: [SPARK-7929] Turn whitespace checker on for more token types.

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 36067ce39 -> 97a60cf75


[SPARK-7929] Turn whitespace checker on for more token types.

This is the last batch of changes to complete SPARK-7929.

Previous related PRs:
https://github.com/apache/spark/pull/6480
https://github.com/apache/spark/pull/6478
https://github.com/apache/spark/pull/6477
https://github.com/apache/spark/pull/6476
https://github.com/apache/spark/pull/6475
https://github.com/apache/spark/pull/6474
https://github.com/apache/spark/pull/6473

Author: Reynold Xin r...@databricks.com

Closes #6487 from rxin/whitespace-lint and squashes the following commits:

b33d43d [Reynold Xin] [SPARK-7929] Turn whitespace checker on for more token 
types.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/97a60cf7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/97a60cf7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/97a60cf7

Branch: refs/heads/master
Commit: 97a60cf75d1fed654953eccedd04f3442389c5ca
Parents: 36067ce
Author: Reynold Xin r...@databricks.com
Authored: Thu May 28 23:00:02 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Thu May 28 23:00:02 2015 -0700

--
 .../flume/sink/TransactionProcessor.scala |  2 +-
 .../spark/streaming/flume/EventTransformer.scala  |  2 +-
 .../spark/streaming/kafka/KafkaRDDSuite.scala |  2 +-
 .../spark/streaming/mqtt/MQTTInputDStream.scala   | 14 +-
 .../examples/streaming/KinesisWordCountASL.scala  |  2 +-
 .../spark/streaming/kinesis/KinesisUtils.scala|  4 ++--
 scalastyle-config.xml | 13 -
 .../spark/sql/hive/HiveInspectorSuite.scala   | 12 ++--
 .../spark/sql/hive/InsertIntoHiveTableSuite.scala |  2 +-
 .../apache/spark/sql/hive/ListTablesSuite.scala   |  2 +-
 .../org/apache/spark/sql/hive/UDFSuite.scala  |  6 +++---
 .../sql/hive/execution/HiveComparisonTest.scala   |  4 ++--
 .../sql/hive/execution/HiveResolutionSuite.scala  |  6 +++---
 .../sql/hive/execution/HiveTableScanSuite.scala   |  2 +-
 .../spark/sql/hive/execution/SQLQuerySuite.scala  |  2 +-
 .../org/apache/spark/sql/hive/parquetSuites.scala |  4 ++--
 .../org/apache/spark/deploy/yarn/Client.scala |  6 +++---
 .../yarn/ClientDistributedCacheManager.scala  | 18 +-
 .../apache/spark/deploy/yarn/ClientSuite.scala|  2 +-
 19 files changed, 52 insertions(+), 53 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/97a60cf7/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
--
diff --git 
a/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
 
b/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
index ea45b14..7ad43b1 100644
--- 
a/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
+++ 
b/external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink/TransactionProcessor.scala
@@ -143,7 +143,7 @@ private class TransactionProcessor(val channel: Channel, 
val seqNum: String,
   eventBatch.setErrorMsg(msg)
 } else {
   // At this point, the events are available, so fill them into the 
event batch
-  eventBatch = new EventBatch("",seqNum, events)
+  eventBatch = new EventBatch("", seqNum, events)
 }
   })
 } catch {

http://git-wip-us.apache.org/repos/asf/spark/blob/97a60cf7/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
--
diff --git 
a/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
 
b/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
index dc629df..65c49c1 100644
--- 
a/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
+++ 
b/external/flume/src/main/scala/org/apache/spark/streaming/flume/EventTransformer.scala
@@ -60,7 +60,7 @@ private[streaming] object EventTransformer extends Logging {
 out.write(body)
 val numHeaders = headers.size()
 out.writeInt(numHeaders)
-for ((k,v) <- headers) {
+for ((k, v) <- headers) {
   val keyBuff = Utils.serialize(k.toString)
   out.writeInt(keyBuff.length)
   out.write(keyBuff)

http://git-wip-us.apache.org/repos/asf/spark/blob/97a60cf7/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaRDDSuite.scala
--
diff --git 
a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaRDDSuite.scala
 

Git Push Summary

2015-05-29 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc3 [deleted] 2d97d7a0a

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7863] [CORE] Create SimpleDateFormat for every SimpleDateParam instance because it's not thread-safe

2015-05-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master bf4658070 -> 8db40f671


[SPARK-7863] [CORE] Create SimpleDateFormat for every SimpleDateParam instance 
because it's not thread-safe

SimpleDateFormat is not thread-safe. This PR creates a new `SimpleDateFormat` for each `SimpleDateParam` instance.
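
A minimal sketch of the general pattern (not taken from the patch itself): 
construct a fresh SimpleDateFormat per use instead of sharing one across 
threads, since the class keeps mutable parsing state internally.

  import java.text.SimpleDateFormat
  import java.util.TimeZone

  // Fresh formatter per call, so concurrent request-handling threads never
  // share parsing state.
  def parseGmtDay(value: String): Long = {
    val fmt = new SimpleDateFormat("yyyy-MM-dd")
    fmt.setTimeZone(TimeZone.getTimeZone("GMT"))
    fmt.parse(value).getTime
  }

  parseGmtDay("2015-02-20")  // 1424390400000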

Author: zsxwing zsxw...@gmail.com

Closes #6406 from zsxwing/SPARK-7863 and squashes the following commits:

aeed4c1 [zsxwing] Rewrite SimpleDateParam
8cdd986 [zsxwing] Inline formats
9680a15 [zsxwing] Create SimpleDateFormat for each SimpleDateParam instance 
because it's not thread-safe


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8db40f67
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8db40f67
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8db40f67

Branch: refs/heads/master
Commit: 8db40f6711058c3c3bf67ceaaacc25d67d19
Parents: bf46580
Author: zsxwing zsxw...@gmail.com
Authored: Fri May 29 05:17:41 2015 -0400
Committer: Sean Owen so...@cloudera.com
Committed: Fri May 29 05:17:41 2015 -0400

--
 .../spark/status/api/v1/SimpleDateParam.scala   | 49 +---
 .../status/api/v1/SimpleDateParamSuite.scala|  5 ++
 2 files changed, 26 insertions(+), 28 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/8db40f67/core/src/main/scala/org/apache/spark/status/api/v1/SimpleDateParam.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/status/api/v1/SimpleDateParam.scala 
b/core/src/main/scala/org/apache/spark/status/api/v1/SimpleDateParam.scala
index cee2978..0c71cd2 100644
--- a/core/src/main/scala/org/apache/spark/status/api/v1/SimpleDateParam.scala
+++ b/core/src/main/scala/org/apache/spark/status/api/v1/SimpleDateParam.scala
@@ -16,40 +16,33 @@
  */
 package org.apache.spark.status.api.v1
 
-import java.text.SimpleDateFormat
+import java.text.{ParseException, SimpleDateFormat}
 import java.util.TimeZone
 import javax.ws.rs.WebApplicationException
 import javax.ws.rs.core.Response
 import javax.ws.rs.core.Response.Status
 
-import scala.util.Try
-
 private[v1] class SimpleDateParam(val originalValue: String) {
-  val timestamp: Long = {
-    SimpleDateParam.formats.collectFirst {
-      case fmt if Try(fmt.parse(originalValue)).isSuccess =>
-        fmt.parse(originalValue).getTime()
-    }.getOrElse(
-      throw new WebApplicationException(
-        Response
-          .status(Status.BAD_REQUEST)
-          .entity("Couldn't parse date: " + originalValue)
-          .build()
-      )
-    )
-  }
-}
 
-private[v1] object SimpleDateParam {
-
-  val formats: Seq[SimpleDateFormat] = {
-
-    val gmtDay = new SimpleDateFormat("yyyy-MM-dd")
-    gmtDay.setTimeZone(TimeZone.getTimeZone("GMT"))
-
-    Seq(
-      new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSz"),
-      gmtDay
-    )
+  val timestamp: Long = {
+    val format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSz")
+    try {
+      format.parse(originalValue).getTime()
+    } catch {
+      case _: ParseException =>
+        val gmtDay = new SimpleDateFormat("yyyy-MM-dd")
+        gmtDay.setTimeZone(TimeZone.getTimeZone("GMT"))
+        try {
+          gmtDay.parse(originalValue).getTime()
+        } catch {
+          case _: ParseException =>
+            throw new WebApplicationException(
+              Response
+                .status(Status.BAD_REQUEST)
+                .entity("Couldn't parse date: " + originalValue)
+                .build()
+            )
+        }
+    }
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/8db40f67/core/src/test/scala/org/apache/spark/status/api/v1/SimpleDateParamSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/status/api/v1/SimpleDateParamSuite.scala 
b/core/src/test/scala/org/apache/spark/status/api/v1/SimpleDateParamSuite.scala
index 731d1f5..183043b 100644
--- 
a/core/src/test/scala/org/apache/spark/status/api/v1/SimpleDateParamSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/status/api/v1/SimpleDateParamSuite.scala
@@ -16,6 +16,8 @@
  */
 package org.apache.spark.status.api.v1
 
+import javax.ws.rs.WebApplicationException
+
 import org.scalatest.{Matchers, FunSuite}
 
 class SimpleDateParamSuite extends FunSuite with Matchers {
@@ -24,6 +26,9 @@ class SimpleDateParamSuite extends FunSuite with Matchers {
 new SimpleDateParam(2015-02-20T23:21:17.190GMT).timestamp should be 
(1424474477190L)
 new SimpleDateParam(2015-02-20T17:21:17.190EST).timestamp should be 
(1424470877190L)
 new SimpleDateParam(2015-02-20).timestamp should be (142439040L) // 
GMT
+intercept[WebApplicationException] {
+  new SimpleDateParam(invalid date)
+}
   }
 
 }



spark git commit: [SPARK-7756] [CORE] Use testing cipher suites common to Oracle and IBM security providers

2015-05-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 23452be94 -> bf4658070


[SPARK-7756] [CORE] Use testing cipher suites common to Oracle and IBM security 
providers

Add alias names for supported cipher suites to the sample SSL configuration.

The IBM JSSE provider reports its cipher suites with an SSL_ prefix, but accepts 
TLS_-prefixed suite names as aliases. However, Jetty filters the requested 
ciphers based on the provider's reported supported suites, so the TLS_ versions 
are never passed through to JSSE, causing an SSL handshake failure.
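
As a hedged illustration (not the exact lines of this patch), the idea is to 
list a suite under both its TLS_ name and the SSL_ alias the IBM provider 
reports, so Jetty's filtering against the provider's supported list keeps at 
least one usable entry; the suite names below are illustrative only:

  import org.apache.spark.SparkConf

  // Both spellings of the same suite, so whichever prefix the active JSSE
  // provider reports will survive Jetty's filtering.
  val conf = new SparkConf()
    .set("spark.ssl.enabledAlgorithms",
      "TLS_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_AES_128_CBC_SHA")
    .set("spark.ssl.protocol", "TLSv1")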

Author: Tim Ellison t.p.elli...@gmail.com

Closes #6282 from tellison/SSLFailure and squashes the following commits:

8de8a3e [Tim Ellison] Update SecurityManagerSuite with new expected suite names
96158b2 [Tim Ellison] Update the sample configs to use ciphers that are common 
to both the Oracle and IBM security providers.
705421b [Tim Ellison] Merge branch 'master' of github.com:tellison/spark into 
SSLFailure
68b9425 [Tim Ellison] Merge branch 'master' of https://github.com/apache/spark 
into SSLFailure
b0c35f6 [Tim Ellison] [CORE] Add aliases used for cipher suites in IBM provider


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bf465807
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bf465807
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bf465807

Branch: refs/heads/master
Commit: bf46580708e41a1d48ac091adbca8d82a4008699
Parents: 23452be
Author: Tim Ellison t.p.elli...@gmail.com
Authored: Fri May 29 05:14:43 2015 -0400
Committer: Sean Owen so...@cloudera.com
Committed: Fri May 29 05:14:43 2015 -0400

--
 core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala | 4 ++--
 core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bf465807/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
--
diff --git a/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala 
b/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
index 308b9ea..1a099da 100644
--- a/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
+++ b/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
@@ -34,7 +34,7 @@ object SSLSampleConfigs {
     conf.set("spark.ssl.trustStore", trustStorePath)
     conf.set("spark.ssl.trustStorePassword", "password")
     conf.set("spark.ssl.enabledAlgorithms",
-      "TLS_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_DES_CBC_SHA")
+      "SSL_RSA_WITH_RC4_128_SHA, SSL_RSA_WITH_DES_CBC_SHA")
     conf.set("spark.ssl.protocol", "TLSv1")
     conf
   }
@@ -48,7 +48,7 @@ object SSLSampleConfigs {
     conf.set("spark.ssl.trustStore", trustStorePath)
     conf.set("spark.ssl.trustStorePassword", "password")
     conf.set("spark.ssl.enabledAlgorithms",
-      "TLS_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_DES_CBC_SHA")
+      "SSL_RSA_WITH_RC4_128_SHA, SSL_RSA_WITH_DES_CBC_SHA")
     conf.set("spark.ssl.protocol", "TLSv1")
     conf
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/bf465807/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala 
b/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
index 62cb764..61571be 100644
--- a/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
@@ -147,7 +147,7 @@ class SecurityManagerSuite extends FunSuite {
     assert(securityManager.fileServerSSLOptions.keyPassword === Some("password"))
     assert(securityManager.fileServerSSLOptions.protocol === Some("TLSv1"))
     assert(securityManager.fileServerSSLOptions.enabledAlgorithms ===
-      Set("TLS_RSA_WITH_AES_128_CBC_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
+      Set("SSL_RSA_WITH_RC4_128_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
 
     assert(securityManager.akkaSSLOptions.trustStore.isDefined === true)
     assert(securityManager.akkaSSLOptions.trustStore.get.getName === "truststore")
@@ -158,7 +158,7 @@ class SecurityManagerSuite extends FunSuite {
     assert(securityManager.akkaSSLOptions.keyPassword === Some("password"))
     assert(securityManager.akkaSSLOptions.protocol === Some("TLSv1"))
     assert(securityManager.akkaSSLOptions.enabledAlgorithms ===
-      Set("TLS_RSA_WITH_AES_128_CBC_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
+      Set("SSL_RSA_WITH_RC4_128_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
   }
 
   test("ssl off setup") {


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7756] [CORE] Use testing cipher suites common to Oracle and IBM security providers

2015-05-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 509a7cafc -> 459c3d22e


[SPARK-7756] [CORE] Use testing cipher suites common to Oracle and IBM security 
providers

Add alias names for supported cipher suites to the sample SSL configuration.

The IBM JSSE provider reports its cipher suites with an SSL_ prefix, but accepts 
TLS_-prefixed suite names as aliases. However, Jetty filters the requested 
ciphers based on the provider's reported supported suites, so the TLS_ versions 
are never passed through to JSSE, causing an SSL handshake failure.

Author: Tim Ellison t.p.elli...@gmail.com

Closes #6282 from tellison/SSLFailure and squashes the following commits:

8de8a3e [Tim Ellison] Update SecurityManagerSuite with new expected suite names
96158b2 [Tim Ellison] Update the sample configs to use ciphers that are common 
to both the Oracle and IBM security providers.
705421b [Tim Ellison] Merge branch 'master' of github.com:tellison/spark into 
SSLFailure
68b9425 [Tim Ellison] Merge branch 'master' of https://github.com/apache/spark 
into SSLFailure
b0c35f6 [Tim Ellison] [CORE] Add aliases used for cipher suites in IBM provider

(cherry picked from commit bf46580708e41a1d48ac091adbca8d82a4008699)
Signed-off-by: Sean Owen so...@cloudera.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/459c3d22
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/459c3d22
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/459c3d22

Branch: refs/heads/branch-1.4
Commit: 459c3d22e0b520f0db21d471e29bdc6c4ec0029a
Parents: 509a7ca
Author: Tim Ellison t.p.elli...@gmail.com
Authored: Fri May 29 05:14:43 2015 -0400
Committer: Sean Owen so...@cloudera.com
Committed: Fri May 29 05:15:00 2015 -0400

--
 core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala | 4 ++--
 core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/459c3d22/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
--
diff --git a/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala 
b/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
index 308b9ea..1a099da 100644
--- a/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
+++ b/core/src/test/scala/org/apache/spark/SSLSampleConfigs.scala
@@ -34,7 +34,7 @@ object SSLSampleConfigs {
     conf.set("spark.ssl.trustStore", trustStorePath)
     conf.set("spark.ssl.trustStorePassword", "password")
     conf.set("spark.ssl.enabledAlgorithms",
-      "TLS_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_DES_CBC_SHA")
+      "SSL_RSA_WITH_RC4_128_SHA, SSL_RSA_WITH_DES_CBC_SHA")
     conf.set("spark.ssl.protocol", "TLSv1")
     conf
   }
@@ -48,7 +48,7 @@ object SSLSampleConfigs {
     conf.set("spark.ssl.trustStore", trustStorePath)
     conf.set("spark.ssl.trustStorePassword", "password")
     conf.set("spark.ssl.enabledAlgorithms",
-      "TLS_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_DES_CBC_SHA")
+      "SSL_RSA_WITH_RC4_128_SHA, SSL_RSA_WITH_DES_CBC_SHA")
     conf.set("spark.ssl.protocol", "TLSv1")
     conf
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/459c3d22/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala 
b/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
index 62cb764..61571be 100644
--- a/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala
@@ -147,7 +147,7 @@ class SecurityManagerSuite extends FunSuite {
     assert(securityManager.fileServerSSLOptions.keyPassword === Some("password"))
     assert(securityManager.fileServerSSLOptions.protocol === Some("TLSv1"))
     assert(securityManager.fileServerSSLOptions.enabledAlgorithms ===
-      Set("TLS_RSA_WITH_AES_128_CBC_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
+      Set("SSL_RSA_WITH_RC4_128_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
 
     assert(securityManager.akkaSSLOptions.trustStore.isDefined === true)
     assert(securityManager.akkaSSLOptions.trustStore.get.getName === "truststore")
@@ -158,7 +158,7 @@ class SecurityManagerSuite extends FunSuite {
     assert(securityManager.akkaSSLOptions.keyPassword === Some("password"))
     assert(securityManager.akkaSSLOptions.protocol === Some("TLSv1"))
     assert(securityManager.akkaSSLOptions.enabledAlgorithms ===
-      Set("TLS_RSA_WITH_AES_128_CBC_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
+      Set("SSL_RSA_WITH_RC4_128_SHA", "SSL_RSA_WITH_DES_CBC_SHA"))
   }
 
   test("ssl off setup") {


-
To unsubscribe, e-mail: 

Git Push Summary

2015-05-29 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc3 [deleted] f2796816b

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7524] [SPARK-7846] add configs for keytab and principal, pass these two configs with different way in different modes

2015-05-29 Thread tgraves
Repository: spark
Updated Branches:
  refs/heads/master 8db40f671 -> a51b133de


[SPARK-7524] [SPARK-7846] add configs for keytab and principal, pass these two 
configs with different way in different modes

* Spark now supports long-running services by updating tokens for the namenode, 
but it only accepts these parameters in --k=v form, which is not very 
convenient. This patch adds spark.* configs that can be set in the properties 
file and as system properties.

* The --principal and --keytab options are passed to the client, but when the 
thrift server or spark-shell is started they are also passed into the main class 
(org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and 
org.apache.spark.repl.Main).
In these two main classes, the arguments are processed by third-party 
libraries, which leads to errors such as "Invalid option: --principal" or 
"Unrecgnised option: --principal".
We should therefore pass these arguments in a different form, e.g. as system properties.
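
A hedged sketch of what this enables (the principal and keytab values below are 
placeholders, not from the patch): the credentials can now be supplied as 
ordinary spark.* properties, e.g. on a SparkConf or in the properties file, 
instead of only via the --principal/--keytab flags.

  import org.apache.spark.SparkConf

  // In YARN client mode these map to the spark.yarn.principal /
  // spark.yarn.keytab system properties added by this patch.
  val conf = new SparkConf()
    .set("spark.yarn.principal", "spark/example.host@EXAMPLE.COM")
    .set("spark.yarn.keytab", "/etc/security/keytabs/spark.keytab")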

Author: WangTaoTheTonic wangtao...@huawei.com

Closes #6051 from WangTaoTheTonic/SPARK-7524 and squashes the following commits:

e65699a [WangTaoTheTonic] change logic to loadEnvironments
ebd9ea0 [WangTaoTheTonic] merge master
ecfe43a [WangTaoTheTonic] pass keytab and principal seperately in different mode
33a7f40 [WangTaoTheTonic] expand the use of the current configs
08bb4e8 [WangTaoTheTonic] fix wrong cite
73afa64 [WangTaoTheTonic] add configs for keytab and principal, move originals 
to internal


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a51b133d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a51b133d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a51b133d

Branch: refs/heads/master
Commit: a51b133de3c65a991ab105b6f020082080121b4c
Parents: 8db40f6
Author: WangTaoTheTonic wangtao...@huawei.com
Authored: Fri May 29 11:06:11 2015 -0500
Committer: Thomas Graves tgra...@thatenemy-lm.champ.corp.yahoo.com
Committed: Fri May 29 11:06:11 2015 -0500

--
 .../scala/org/apache/spark/deploy/SparkSubmit.scala |  8 
 .../apache/spark/deploy/SparkSubmitArguments.scala  |  2 ++
 docs/running-on-yarn.md | 16 
 .../deploy/yarn/AMDelegationTokenRenewer.scala  | 14 --
 .../apache/spark/deploy/yarn/ClientArguments.scala  |  6 ++
 5 files changed, 36 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a51b133d/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala 
b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 92bb505..d1b32ea 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -428,6 +428,8 @@ object SparkSubmit {
       OptionAssigner(args.executorCores, YARN, CLIENT, sysProp = "spark.executor.cores"),
       OptionAssigner(args.files, YARN, CLIENT, sysProp = "spark.yarn.dist.files"),
       OptionAssigner(args.archives, YARN, CLIENT, sysProp = "spark.yarn.dist.archives"),
+      OptionAssigner(args.principal, YARN, CLIENT, sysProp = "spark.yarn.principal"),
+      OptionAssigner(args.keytab, YARN, CLIENT, sysProp = "spark.yarn.keytab"),
 
       // Yarn cluster only
       OptionAssigner(args.name, YARN, CLUSTER, clOption = "--name"),
@@ -440,10 +442,8 @@ object SparkSubmit {
       OptionAssigner(args.files, YARN, CLUSTER, clOption = "--files"),
       OptionAssigner(args.archives, YARN, CLUSTER, clOption = "--archives"),
       OptionAssigner(args.jars, YARN, CLUSTER, clOption = "--addJars"),
-
-      // Yarn client or cluster
-      OptionAssigner(args.principal, YARN, ALL_DEPLOY_MODES, clOption = "--principal"),
-      OptionAssigner(args.keytab, YARN, ALL_DEPLOY_MODES, clOption = "--keytab"),
+      OptionAssigner(args.principal, YARN, CLUSTER, clOption = "--principal"),
+      OptionAssigner(args.keytab, YARN, CLUSTER, clOption = "--keytab"),
 
       // Other options
       OptionAssigner(args.executorCores, STANDALONE, ALL_DEPLOY_MODES,

http://git-wip-us.apache.org/repos/asf/spark/blob/a51b133d/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala 
b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
index c0e4c77..cc6a7bd 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
@@ -169,6 +169,8 @@ private[deploy] class SparkSubmitArguments(args: 
Seq[String], env: Map[String, S
 deployMode = 

spark git commit: HOTFIX: Scala style checker failure due to a missing space in TachyonBlockManager.scala.

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 459c3d22e -> 23bd05fff


HOTFIX: Scala style checker failure due to a missing space in 
TachyonBlockManager.scala.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/23bd05ff
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/23bd05ff
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/23bd05ff

Branch: refs/heads/branch-1.4
Commit: 23bd05fff78ae4adbd7dd4f3edf4eea6ac63139d
Parents: 459c3d2
Author: Reynold Xin r...@databricks.com
Authored: Fri May 29 09:37:46 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 09:37:46 2015 -0700

--
 .../main/scala/org/apache/spark/storage/TachyonBlockManager.scala  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/23bd05ff/core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala 
b/core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala
index bdc6276..d3b8c80 100644
--- a/core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala
+++ b/core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala
@@ -38,7 +38,7 @@ import org.apache.spark.util.Utils
  */
 private[spark] class TachyonBlockManager() extends ExternalBlockManager with 
Logging {
 
-  var blockManager: BlockManager =_
+  var blockManager: BlockManager = _
   var rootDirs: String = _
   var master: String = _
   var client: tachyon.client.TachyonFS = _


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

2015-05-29 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master 4782e1304 -> 6181937f3


[SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

Author: MechCoder manojkumarsivaraj...@gmail.com

Closes #6497 from MechCoder/spark-7946 and squashes the following commits:

2fdd0a3 [MechCoder] Add non-regression test
8c988c6 [MechCoder] [SPARK-7946] DecayFactor wrongly set in StreamingKMeans
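
For reference, the bug was a self-assignment in the setter 
(`this.decayFactor = decayFactor`), so the configured value was silently 
dropped. A minimal usage sketch (assumed, not from the patch) of the 
now-working setter:

  import org.apache.spark.mllib.clustering.StreamingKMeans

  // With the fix, the configured decay factor is actually retained.
  val model = new StreamingKMeans()
    .setK(2)
    .setDecayFactor(0.5)
    .setRandomCenters(3, 0.0)  // 3-dimensional random initial centers, weight 0.0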


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6181937f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6181937f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6181937f

Branch: refs/heads/master
Commit: 6181937f315480543d28e542d43269cfa591e9d0
Parents: 4782e13
Author: MechCoder manojkumarsivaraj...@gmail.com
Authored: Fri May 29 11:36:41 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Fri May 29 11:36:41 2015 -0700

--
 .../org/apache/spark/mllib/clustering/StreamingKMeans.scala   | 2 +-
 .../apache/spark/mllib/clustering/StreamingKMeansSuite.scala  | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6181937f/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
index 812014a..c21e4fe 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
@@ -178,7 +178,7 @@ class StreamingKMeans(
 
   /** Set the decay factor directly (for forgetful algorithms). */
   def setDecayFactor(a: Double): this.type = {
-this.decayFactor = decayFactor
+this.decayFactor = a
 this
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/6181937f/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
index f90025d..13f9b17 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
@@ -133,6 +133,13 @@ class StreamingKMeansSuite extends FunSuite with 
TestSuiteBase {
 assert(math.abs(c1) ~== 0.8 absTol 0.6)
   }
 
+  test("SPARK-7946 setDecayFactor") {
+val kMeans = new StreamingKMeans()
+assert(kMeans.decayFactor === 1.0)
+kMeans.setDecayFactor(2.0)
+assert(kMeans.decayFactor === 2.0)
+  }
+
   def StreamingKMeansDataGenerator(
   numPoints: Int,
   numBatches: Int,


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

2015-05-29 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 645e61164 -> 4be701aa5


[SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

Author: MechCoder manojkumarsivaraj...@gmail.com

Closes #6497 from MechCoder/spark-7946 and squashes the following commits:

2fdd0a3 [MechCoder] Add non-regression test
8c988c6 [MechCoder] [SPARK-7946] DecayFactor wrongly set in StreamingKMeans

(cherry picked from commit 6181937f315480543d28e542d43269cfa591e9d0)
Signed-off-by: Xiangrui Meng m...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4be701aa
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4be701aa
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4be701aa

Branch: refs/heads/branch-1.4
Commit: 4be701aa50711ff22a0a0d7a87c9857ea2ef8e22
Parents: 645e611
Author: MechCoder manojkumarsivaraj...@gmail.com
Authored: Fri May 29 11:36:41 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Fri May 29 11:36:48 2015 -0700

--
 .../org/apache/spark/mllib/clustering/StreamingKMeans.scala   | 2 +-
 .../apache/spark/mllib/clustering/StreamingKMeansSuite.scala  | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4be701aa/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
index 812014a..c21e4fe 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
@@ -178,7 +178,7 @@ class StreamingKMeans(
 
   /** Set the decay factor directly (for forgetful algorithms). */
   def setDecayFactor(a: Double): this.type = {
-this.decayFactor = decayFactor
+this.decayFactor = a
 this
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/4be701aa/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
index f90025d..13f9b17 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
@@ -133,6 +133,13 @@ class StreamingKMeansSuite extends FunSuite with 
TestSuiteBase {
 assert(math.abs(c1) ~== 0.8 absTol 0.6)
   }
 
+  test("SPARK-7946 setDecayFactor") {
+val kMeans = new StreamingKMeans()
+assert(kMeans.decayFactor === 1.0)
+kMeans.setDecayFactor(2.0)
+assert(kMeans.decayFactor === 2.0)
+  }
+
   def StreamingKMeansDataGenerator(
   numPoints: Int,
   numBatches: Int,


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

2015-05-29 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 d09a053ec -> ad5daa3a3


[SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

Author: MechCoder manojkumarsivaraj...@gmail.com

Closes #6497 from MechCoder/spark-7946 and squashes the following commits:

2fdd0a3 [MechCoder] Add non-regression test
8c988c6 [MechCoder] [SPARK-7946] DecayFactor wrongly set in StreamingKMeans

(cherry picked from commit 6181937f315480543d28e542d43269cfa591e9d0)
Signed-off-by: Xiangrui Meng m...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ad5daa3a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ad5daa3a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ad5daa3a

Branch: refs/heads/branch-1.3
Commit: ad5daa3a3ee3f9129f5cb6eb0ad4be246db448b9
Parents: d09a053
Author: MechCoder manojkumarsivaraj...@gmail.com
Authored: Fri May 29 11:36:41 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Fri May 29 11:36:58 2015 -0700

--
 .../org/apache/spark/mllib/clustering/StreamingKMeans.scala   | 2 +-
 .../apache/spark/mllib/clustering/StreamingKMeansSuite.scala  | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ad5daa3a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
index d4606fd..020e76e 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
@@ -179,7 +179,7 @@ class StreamingKMeans(
 
   /** Set the decay factor directly (for forgetful algorithms). */
   def setDecayFactor(a: Double): this.type = {
-this.decayFactor = decayFactor
+this.decayFactor = a
 this
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/ad5daa3a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
index 850c9fc..83ee4a9 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
@@ -133,6 +133,13 @@ class StreamingKMeansSuite extends FunSuite with 
TestSuiteBase {
 assert(math.abs(c1) ~== 0.8 absTol 0.6)
   }
 
+  test("SPARK-7946 setDecayFactor") {
+    val kMeans = new StreamingKMeans()
+    assert(kMeans.decayFactor === 1.0)
+    kMeans.setDecayFactor(2.0)
+    assert(kMeans.decayFactor === 2.0)
+  }
+
   def StreamingKMeansDataGenerator(
   numPoints: Int,
   numBatches: Int,


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext()

2015-05-29 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/master a51b133de -> e7b617755


[SPARK-7950] [SQL] Sets spark.sql.hive.version in 
HiveThriftServer2.startWithContext()

When starting `HiveThriftServer2` via `startWithContext`, the property
`spark.sql.hive.version` isn't set. This causes Simba ODBC driver 1.0.8.1006
to behave differently and fail simple queries.

The Hive2 JDBC driver works fine in this case. Also, when starting the server
with `start-thriftserver.sh`, both the Hive2 JDBC driver and the Simba ODBC
driver work fine.

Please refer to [SPARK-7950] [1] for details.

[1]: https://issues.apache.org/jira/browse/SPARK-7950
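
For reference, a minimal sketch of starting the server programmatically, which is the path this patch targets (assumes Spark 1.x with the hive-thriftserver module on the classpath; the app name is made up). With this change, `startWithContext` records the Hive version in the SQL conf so ODBC clients that probe `spark.sql.hive.version` get an answer:

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

object StartThriftServerDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("thriftserver-demo"))
    val hiveContext = new HiveContext(sc)
    // After this patch, startWithContext sets spark.sql.hive.version itself,
    // so ODBC drivers that read the property no longer fail.
    HiveThriftServer2.startWithContext(hiveContext)
  }
}
```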

Author: Cheng Lian l...@databricks.com

Closes #6500 from liancheng/odbc-bugfix and squashes the following commits:

051e3a3 [Cheng Lian] Fixes import order
3a97376 [Cheng Lian] Sets spark.sql.hive.version in 
HiveThriftServer2.startWithContext()


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e7b61775
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e7b61775
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e7b61775

Branch: refs/heads/master
Commit: e7b61775571ce7a06d044bc3a6055ff94c7477d6
Parents: a51b133
Author: Cheng Lian l...@databricks.com
Authored: Fri May 29 10:43:34 2015 -0700
Committer: Yin Huai yh...@databricks.com
Committed: Fri May 29 10:43:34 2015 -0700

--
 .../sql/hive/thriftserver/HiveThriftServer2.scala| 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e7b61775/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
--
diff --git 
a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
 
b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
index 3458b04..94687ee 100644
--- 
a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
+++ 
b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
@@ -17,23 +17,23 @@
 
 package org.apache.spark.sql.hive.thriftserver
 
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
 import org.apache.commons.logging.LogFactory
 import org.apache.hadoop.hive.conf.HiveConf
 import org.apache.hadoop.hive.conf.HiveConf.ConfVars
 import org.apache.hive.service.cli.thrift.{ThriftBinaryCLIService, 
ThriftHttpCLIService}
 import org.apache.hive.service.server.{HiveServer2, ServerOptionsProcessor}
-import org.apache.spark.sql.SQLConf
 
-import org.apache.spark.{SparkContext, SparkConf, Logging}
 import org.apache.spark.annotation.DeveloperApi
-import org.apache.spark.sql.hive.HiveContext
+import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd, 
SparkListenerJobStart}
+import org.apache.spark.sql.SQLConf
 import org.apache.spark.sql.hive.thriftserver.ReflectionUtils._
-import org.apache.spark.scheduler.{SparkListenerJobStart, 
SparkListenerApplicationEnd, SparkListener}
 import org.apache.spark.sql.hive.thriftserver.ui.ThriftServerTab
+import org.apache.spark.sql.hive.{HiveContext, HiveShim}
 import org.apache.spark.util.Utils
-
-import scala.collection.mutable
-import scala.collection.mutable.ArrayBuffer
+import org.apache.spark.{Logging, SparkContext}
 
 /**
  * The main entry point for the Spark SQL port of HiveServer2.  Starts up a 
`SparkSQLContext` and a
@@ -51,6 +51,7 @@ object HiveThriftServer2 extends Logging {
   @DeveloperApi
   def startWithContext(sqlContext: HiveContext): Unit = {
 val server = new HiveThriftServer2(sqlContext)
+    sqlContext.setConf("spark.sql.hive.version", HiveShim.version)
 server.init(sqlContext.hiveconf)
 server.start()
 listener = new HiveThriftServer2Listener(server, sqlContext.conf)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext()

2015-05-29 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 23bd05fff -> caea7a618


[SPARK-7950] [SQL] Sets spark.sql.hive.version in 
HiveThriftServer2.startWithContext()

When starting `HiveThriftServer2` via `startWithContext`, the property
`spark.sql.hive.version` isn't set. This causes Simba ODBC driver 1.0.8.1006
to behave differently and fail simple queries.

The Hive2 JDBC driver works fine in this case. Also, when starting the server
with `start-thriftserver.sh`, both the Hive2 JDBC driver and the Simba ODBC
driver work fine.

Please refer to [SPARK-7950] [1] for details.

[1]: https://issues.apache.org/jira/browse/SPARK-7950

Author: Cheng Lian l...@databricks.com

Closes #6500 from liancheng/odbc-bugfix and squashes the following commits:

051e3a3 [Cheng Lian] Fixes import order
3a97376 [Cheng Lian] Sets spark.sql.hive.version in 
HiveThriftServer2.startWithContext()

(cherry picked from commit e7b61775571ce7a06d044bc3a6055ff94c7477d6)
Signed-off-by: Yin Huai yh...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/caea7a61
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/caea7a61
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/caea7a61

Branch: refs/heads/branch-1.4
Commit: caea7a618db7989a37ee59fcf928678efadba3e0
Parents: 23bd05f
Author: Cheng Lian l...@databricks.com
Authored: Fri May 29 10:43:34 2015 -0700
Committer: Yin Huai yh...@databricks.com
Committed: Fri May 29 10:43:44 2015 -0700

--
 .../sql/hive/thriftserver/HiveThriftServer2.scala| 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/caea7a61/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
--
diff --git 
a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
 
b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
index 3458b04..94687ee 100644
--- 
a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
+++ 
b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala
@@ -17,23 +17,23 @@
 
 package org.apache.spark.sql.hive.thriftserver
 
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
 import org.apache.commons.logging.LogFactory
 import org.apache.hadoop.hive.conf.HiveConf
 import org.apache.hadoop.hive.conf.HiveConf.ConfVars
 import org.apache.hive.service.cli.thrift.{ThriftBinaryCLIService, 
ThriftHttpCLIService}
 import org.apache.hive.service.server.{HiveServer2, ServerOptionsProcessor}
-import org.apache.spark.sql.SQLConf
 
-import org.apache.spark.{SparkContext, SparkConf, Logging}
 import org.apache.spark.annotation.DeveloperApi
-import org.apache.spark.sql.hive.HiveContext
+import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd, 
SparkListenerJobStart}
+import org.apache.spark.sql.SQLConf
 import org.apache.spark.sql.hive.thriftserver.ReflectionUtils._
-import org.apache.spark.scheduler.{SparkListenerJobStart, 
SparkListenerApplicationEnd, SparkListener}
 import org.apache.spark.sql.hive.thriftserver.ui.ThriftServerTab
+import org.apache.spark.sql.hive.{HiveContext, HiveShim}
 import org.apache.spark.util.Utils
-
-import scala.collection.mutable
-import scala.collection.mutable.ArrayBuffer
+import org.apache.spark.{Logging, SparkContext}
 
 /**
  * The main entry point for the Spark SQL port of HiveServer2.  Starts up a 
`SparkSQLContext` and a
@@ -51,6 +51,7 @@ object HiveThriftServer2 extends Logging {
   @DeveloperApi
   def startWithContext(sqlContext: HiveContext): Unit = {
 val server = new HiveThriftServer2(sqlContext)
+    sqlContext.setConf("spark.sql.hive.version", HiveShim.version)
 server.init(sqlContext.hiveconf)
 server.start()
 listener = new HiveThriftServer2Listener(server, sqlContext.conf)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: HOTFIX: Scala style checker for DataTypeSuite.scala.

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 caea7a618 -> 62df047a3


HOTFIX: Scala style checker for DataTypeSuite.scala.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/62df047a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/62df047a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/62df047a

Branch: refs/heads/branch-1.4
Commit: 62df047a3660ffe026aa64dff8e9f096d994a8f3
Parents: caea7a6
Author: Reynold Xin r...@databricks.com
Authored: Fri May 29 11:06:33 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 11:06:33 2015 -0700

--
 .../src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/62df047a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
index 7ccc936..953debf 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala
@@ -120,7 +120,7 @@ class DataTypeSuite extends FunSuite {
   checkDefaultSize(DecimalType(10, 5), 4096)
   checkDefaultSize(DecimalType.Unlimited, 4096)
   checkDefaultSize(DateType, 4)
-  checkDefaultSize(TimestampType,12)
+  checkDefaultSize(TimestampType, 12)
   checkDefaultSize(StringType, 4096)
   checkDefaultSize(BinaryType, 4096)
   checkDefaultSize(ArrayType(DoubleType, true), 800)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

2015-05-29 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.2 c0a0eaacc -> aefb113c8


[SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans

Author: MechCoder manojkumarsivaraj...@gmail.com

Closes #6497 from MechCoder/spark-7946 and squashes the following commits:

2fdd0a3 [MechCoder] Add non-regression test
8c988c6 [MechCoder] [SPARK-7946] DecayFactor wrongly set in StreamingKMeans

(cherry picked from commit 6181937f315480543d28e542d43269cfa591e9d0)
Signed-off-by: Xiangrui Meng m...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/aefb113c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/aefb113c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/aefb113c

Branch: refs/heads/branch-1.2
Commit: aefb113c86cb3b0ab6da3a7ddd601b0caf4d762f
Parents: c0a0eaa
Author: MechCoder manojkumarsivaraj...@gmail.com
Authored: Fri May 29 11:36:41 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Fri May 29 11:37:09 2015 -0700

--
 .../org/apache/spark/mllib/clustering/StreamingKMeans.scala   | 2 +-
 .../apache/spark/mllib/clustering/StreamingKMeansSuite.scala  | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/aefb113c/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
index 5ab7e1a..9b10ce6 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/clustering/StreamingKMeans.scala
@@ -174,7 +174,7 @@ class StreamingKMeans(
 
   /** Set the decay factor directly (for forgetful algorithms). */
   def setDecayFactor(a: Double): this.type = {
-this.decayFactor = decayFactor
+this.decayFactor = a
 this
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/aefb113c/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
index 850c9fc..83ee4a9 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/clustering/StreamingKMeansSuite.scala
@@ -133,6 +133,13 @@ class StreamingKMeansSuite extends FunSuite with 
TestSuiteBase {
 assert(math.abs(c1) ~== 0.8 absTol 0.6)
   }
 
+  test("SPARK-7946 setDecayFactor") {
+    val kMeans = new StreamingKMeans()
+    assert(kMeans.decayFactor === 1.0)
+    kMeans.setDecayFactor(2.0)
+    assert(kMeans.decayFactor === 2.0)
+  }
+
   def StreamingKMeansDataGenerator(
   numPoints: Int,
   numBatches: Int,


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7954] [SPARKR] Create SparkContext in sparkRSQL init

2015-05-29 Thread davies
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 cf4122e4d -> 2bd446054


[SPARK-7954] [SPARKR] Create SparkContext in sparkRSQL init

cc davies

Author: Shivaram Venkataraman shiva...@cs.berkeley.edu

Closes #6507 from shivaram/sparkr-init and squashes the following commits:

6fdd169 [Shivaram Venkataraman] Create SparkContext in sparkRSQL init

(cherry picked from commit 5fb97dca9bcfc29ac33823554c8783997e811b99)
Signed-off-by: Davies Liu dav...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2bd44605
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2bd44605
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2bd44605

Branch: refs/heads/branch-1.4
Commit: 2bd4460548913dbbfadc34c52f8318d6be8949e0
Parents: cf4122e
Author: Shivaram Venkataraman shiva...@cs.berkeley.edu
Authored: Fri May 29 15:08:30 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Fri May 29 15:08:50 2015 -0700

--
 R/pkg/R/sparkR.R | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2bd44605/R/pkg/R/sparkR.R
--
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index 68387f0..5ced7c6 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -225,14 +225,21 @@ sparkR.init <- function(
 #' sqlContext <- sparkRSQL.init(sc)
 #'}
 
-sparkRSQL.init <- function(jsc) {
+sparkRSQL.init <- function(jsc = NULL) {
   if (exists(".sparkRSQLsc", envir = .sparkREnv)) {
     return(get(".sparkRSQLsc", envir = .sparkREnv))
   }
 
+  # If jsc is NULL, create a Spark Context
+  sc <- if (is.null(jsc)) {
+    sparkR.init()
+  } else {
+    jsc
+  }
+
   sqlContext <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
-                            "createSQLContext",
-                            jsc)
+                            "createSQLContext",
+                            sc)
   assign(".sparkRSQLsc", sqlContext, envir = .sparkREnv)
   sqlContext
 }
@@ -249,12 +256,19 @@ sparkRSQL.init <- function(jsc) {
 #' sqlContext <- sparkRHive.init(sc)
 #'}
 
-sparkRHive.init <- function(jsc) {
+sparkRHive.init <- function(jsc = NULL) {
   if (exists(".sparkRHivesc", envir = .sparkREnv)) {
     return(get(".sparkRHivesc", envir = .sparkREnv))
   }
 
-  ssc <- callJMethod(jsc, "sc")
+  # If jsc is NULL, create a Spark Context
+  sc <- if (is.null(jsc)) {
+    sparkR.init()
+  } else {
+    jsc
+  }
+
+  ssc <- callJMethod(sc, "sc")
   hiveCtx <- tryCatch({
     newJObject("org.apache.spark.sql.hive.HiveContext", ssc)
   }, error = function(err) {


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-6013] [ML] Add more Python ML examples for spark.ml

2015-05-29 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 2bd446054 -> 9a88be183


[SPARK-6013] [ML] Add more Python ML examples for spark.ml

Author: Ram Sriharsha rsriharsha@hw11853.local

Closes #6443 from harsha2010/SPARK-6013 and squashes the following commits:

732506e [Ram Sriharsha] Code Review Feedback
121c211 [Ram Sriharsha] python style fix
5f9b8c3 [Ram Sriharsha] python style fixes
925ca86 [Ram Sriharsha] Simple Params Example
8b372b1 [Ram Sriharsha] GBT Example
965ec14 [Ram Sriharsha] Random Forest Example

(cherry picked from commit dbf8ff38de0f95f467b874a5b527dcf59439efe8)
Signed-off-by: Joseph K. Bradley jos...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9a88be18
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9a88be18
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9a88be18

Branch: refs/heads/branch-1.4
Commit: 9a88be18334405a2b3f43a8d5b4aefe5a63c3e61
Parents: 2bd4460
Author: Ram Sriharsha rsriharsha@hw11853.local
Authored: Fri May 29 15:22:26 2015 -0700
Committer: Joseph K. Bradley jos...@databricks.com
Committed: Fri May 29 15:22:38 2015 -0700

--
 .../examples/ml/JavaSimpleParamsExample.java|  2 +-
 .../main/python/ml/gradient_boosted_trees.py| 83 +
 .../src/main/python/ml/random_forest_example.py | 87 +
 .../src/main/python/ml/simple_params_example.py | 98 
 .../spark/examples/ml/SimpleParamsExample.scala |  2 +-
 5 files changed, 270 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9a88be18/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
--
diff --git 
a/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
 
b/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
index 29158d5..dac649d 100644
--- 
a/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
+++ 
b/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
@@ -97,7 +97,7 @@ public class JavaSimpleParamsExample {
 DataFrame test = jsql.createDataFrame(jsc.parallelize(localTest), 
LabeledPoint.class);
 
 // Make predictions on test documents using the Transformer.transform() 
method.
-// LogisticRegression.transform will only use the 'features' column.
+// LogisticRegressionModel.transform will only use the 'features' column.
 // Note that model2.transform() outputs a 'myProbability' column instead 
of the usual
 // 'probability' column since we renamed the lr.probabilityCol parameter 
previously.
 DataFrame results = model2.transform(test);

http://git-wip-us.apache.org/repos/asf/spark/blob/9a88be18/examples/src/main/python/ml/gradient_boosted_trees.py
--
diff --git a/examples/src/main/python/ml/gradient_boosted_trees.py 
b/examples/src/main/python/ml/gradient_boosted_trees.py
new file mode 100644
index 000..6446f0f
--- /dev/null
+++ b/examples/src/main/python/ml/gradient_boosted_trees.py
@@ -0,0 +1,83 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the License); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an AS IS BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import print_function
+
+import sys
+
+from pyspark import SparkContext
+from pyspark.ml.classification import GBTClassifier
+from pyspark.ml.feature import StringIndexer
+from pyspark.ml.regression import GBTRegressor
+from pyspark.mllib.evaluation import BinaryClassificationMetrics, 
RegressionMetrics
+from pyspark.mllib.util import MLUtils
+from pyspark.sql import Row, SQLContext
+
+
+"""
+A simple example demonstrating a Gradient Boosted Trees Classification/Regression Pipeline.
+Note: GBTClassifier only supports binary classification currently
+Run with:
+  bin/spark-submit examples/src/main/python/ml/gradient_boosted_trees.py
+"""
+
+
+def testClassification(train, test):
+    # Train a GradientBoostedTrees model.
+
+    rf = GBTClassifier(maxIter=30, maxDepth=4, labelCol="indexedLabel")
+
+  

spark git commit: [SPARK-6013] [ML] Add more Python ML examples for spark.ml

2015-05-29 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/master 5fb97dca9 -> dbf8ff38d


[SPARK-6013] [ML] Add more Python ML examples for spark.ml

Author: Ram Sriharsha rsriharsha@hw11853.local

Closes #6443 from harsha2010/SPARK-6013 and squashes the following commits:

732506e [Ram Sriharsha] Code Review Feedback
121c211 [Ram Sriharsha] python style fix
5f9b8c3 [Ram Sriharsha] python style fixes
925ca86 [Ram Sriharsha] Simple Params Example
8b372b1 [Ram Sriharsha] GBT Example
965ec14 [Ram Sriharsha] Random Forest Example


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dbf8ff38
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dbf8ff38
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dbf8ff38

Branch: refs/heads/master
Commit: dbf8ff38de0f95f467b874a5b527dcf59439efe8
Parents: 5fb97dc
Author: Ram Sriharsha rsriharsha@hw11853.local
Authored: Fri May 29 15:22:26 2015 -0700
Committer: Joseph K. Bradley jos...@databricks.com
Committed: Fri May 29 15:22:26 2015 -0700

--
 .../examples/ml/JavaSimpleParamsExample.java|  2 +-
 .../main/python/ml/gradient_boosted_trees.py| 83 +
 .../src/main/python/ml/random_forest_example.py | 87 +
 .../src/main/python/ml/simple_params_example.py | 98 
 .../spark/examples/ml/SimpleParamsExample.scala |  2 +-
 5 files changed, 270 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/dbf8ff38/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
--
diff --git 
a/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
 
b/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
index 29158d5..dac649d 100644
--- 
a/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
+++ 
b/examples/src/main/java/org/apache/spark/examples/ml/JavaSimpleParamsExample.java
@@ -97,7 +97,7 @@ public class JavaSimpleParamsExample {
 DataFrame test = jsql.createDataFrame(jsc.parallelize(localTest), 
LabeledPoint.class);
 
 // Make predictions on test documents using the Transformer.transform() 
method.
-// LogisticRegression.transform will only use the 'features' column.
+// LogisticRegressionModel.transform will only use the 'features' column.
 // Note that model2.transform() outputs a 'myProbability' column instead 
of the usual
 // 'probability' column since we renamed the lr.probabilityCol parameter 
previously.
 DataFrame results = model2.transform(test);

http://git-wip-us.apache.org/repos/asf/spark/blob/dbf8ff38/examples/src/main/python/ml/gradient_boosted_trees.py
--
diff --git a/examples/src/main/python/ml/gradient_boosted_trees.py 
b/examples/src/main/python/ml/gradient_boosted_trees.py
new file mode 100644
index 000..6446f0f
--- /dev/null
+++ b/examples/src/main/python/ml/gradient_boosted_trees.py
@@ -0,0 +1,83 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the License); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an AS IS BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import print_function
+
+import sys
+
+from pyspark import SparkContext
+from pyspark.ml.classification import GBTClassifier
+from pyspark.ml.feature import StringIndexer
+from pyspark.ml.regression import GBTRegressor
+from pyspark.mllib.evaluation import BinaryClassificationMetrics, 
RegressionMetrics
+from pyspark.mllib.util import MLUtils
+from pyspark.sql import Row, SQLContext
+
+"""
+A simple example demonstrating a Gradient Boosted Trees Classification/Regression Pipeline.
+Note: GBTClassifier only supports binary classification currently
+Run with:
+  bin/spark-submit examples/src/main/python/ml/gradient_boosted_trees.py
+"""
+
+
+def testClassification(train, test):
+    # Train a GradientBoostedTrees model.
+
+    rf = GBTClassifier(maxIter=30, maxDepth=4, labelCol="indexedLabel")
+
+    model = rf.fit(train)
+    predictionAndLabels = model.transform(test).select("prediction", "indexedLabel") \
+        .map(lambda x: 

spark git commit: [HOT FIX] [BUILD] Fix maven build failures

2015-05-29 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 8c9979337 -> a4f24123d


[HOT FIX] [BUILD] Fix maven build failures

This patch fixes a build break in maven caused by #6441.

Note that this patch reverts the changes in flume-sink because
this module does not currently depend on Spark core, but the
tests require it. There is not an easy way to make this work
because mvn test dependencies are not transitive (MNG-1378).

For now, we will leave the one test suite in flume-sink out
until we figure out a better solution. This patch is mainly
intended to unbreak the maven build.

Author: Andrew Or and...@databricks.com

Closes #6511 from andrewor14/fix-build-mvn and squashes the following commits:

3d53643 [Andrew Or] [HOT FIX #6441] Fix maven build failures


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a4f24123
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a4f24123
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a4f24123

Branch: refs/heads/master
Commit: a4f24123d8857656524c9138c7c067a4b1033a5e
Parents: 8c99793
Author: Andrew Or and...@databricks.com
Authored: Fri May 29 17:19:46 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Fri May 29 17:19:46 2015 -0700

--
 external/flume-sink/pom.xml   | 7 ---
 .../apache/spark/streaming/flume/sink/SparkSinkSuite.scala| 5 ++---
 mllib/pom.xml | 7 +++
 3 files changed, 9 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a4f24123/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index bb2ec96..1f3e619 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -36,13 +36,6 @@
 
   <dependencies>
     <dependency>
-      <groupId>org.apache.spark</groupId>
-      <artifactId>spark-core_${scala.binary.version}</artifactId>
-      <version>${project.version}</version>
-      <type>test-jar</type>
-      <scope>test</scope>
-    </dependency>
-    <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-lang3</artifactId>
     </dependency>

http://git-wip-us.apache.org/repos/asf/spark/blob/a4f24123/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
--
diff --git 
a/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
 
b/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
index e9fbcb9..650b2fb 100644
--- 
a/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
+++ 
b/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
@@ -31,10 +31,9 @@ import org.apache.flume.Context
 import org.apache.flume.channel.MemoryChannel
 import org.apache.flume.event.EventBuilder
 import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory
+import org.scalatest.FunSuite
 
-import org.apache.spark.SparkFunSuite
-
-class SparkSinkSuite extends SparkFunSuite {
+class SparkSinkSuite extends FunSuite {
   val eventsPerBatch = 1000
   val channelCapacity = 5000
 

http://git-wip-us.apache.org/repos/asf/spark/blob/a4f24123/mllib/pom.xml
--
diff --git a/mllib/pom.xml b/mllib/pom.xml
index 0c07ca1..65c647a 100644
--- a/mllib/pom.xml
+++ b/mllib/pom.xml
@@ -42,6 +42,13 @@
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
       <artifactId>spark-streaming_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
     </dependency>


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7910] [TINY] [JAVAAPI] expose partitioner information in javardd

2015-05-29 Thread joshrosen
Repository: spark
Updated Branches:
  refs/heads/master 1c5b19827 -> 82a396c2f


[SPARK-7910] [TINY] [JAVAAPI] expose partitioner information in javardd

Author: Holden Karau hol...@pigscanfly.ca

Closes #6464 from holdenk/SPARK-7910-expose-partitioner-information-in-javardd 
and squashes the following commits:

de1e644 [Holden Karau] Fix the test to get the partitioner
bdb31cc [Holden Karau] Add Mima exclude for the new method
347ef4c [Holden Karau] Add a quick little test for the partitioner JavaAPI
f49dca9 [Holden Karau] Add partitoner information to JavaRDDLike and fix some 
whitespace
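
A short usage sketch of the new accessor (assumes a live SparkContext named `sc`; after this change the Java API mirrors the Scala-side `Option[Partitioner]` as a Guava `Optional`):

```
import org.apache.spark.HashPartitioner
import org.apache.spark.api.java.JavaPairRDD

val pairs = sc.parallelize(Seq(1 -> "a", 2 -> "b")).partitionBy(new HashPartitioner(2))
val javaPairs = JavaPairRDD.fromRDD(pairs)

assert(pairs.partitioner.isDefined)      // Scala API: Option[Partitioner]
assert(javaPairs.partitioner.isPresent)  // Java API (new): Optional[Partitioner]
```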


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/82a396c2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/82a396c2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/82a396c2

Branch: refs/heads/master
Commit: 82a396c2f594bade276606dcd0c0545a650fb838
Parents: 1c5b198
Author: Holden Karau hol...@pigscanfly.ca
Authored: Fri May 29 14:59:18 2015 -0700
Committer: Josh Rosen joshro...@databricks.com
Committed: Fri May 29 14:59:18 2015 -0700

--
 .../main/scala/org/apache/spark/api/java/JavaRDDLike.scala  | 9 ++---
 core/src/test/java/org/apache/spark/JavaAPISuite.java   | 2 ++
 project/MimaExcludes.scala  | 2 ++
 3 files changed, 10 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/82a396c2/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala 
b/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
index b8e15f3..c95615a 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
@@ -60,10 +60,13 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
 
   @deprecated("Use partitions() instead.", "1.1.0")
   def splits: JList[Partition] = new java.util.ArrayList(rdd.partitions.toSeq)
-  
+
   /** Set of partitions in this RDD. */
   def partitions: JList[Partition] = new 
java.util.ArrayList(rdd.partitions.toSeq)
 
+  /** The partitioner of this RDD. */
+  def partitioner: Optional[Partitioner] = 
JavaUtils.optionToOptional(rdd.partitioner)
+
   /** The [[org.apache.spark.SparkContext]] that this RDD was created on. */
   def context: SparkContext = rdd.context
 
@@ -492,9 +495,9 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
 new java.util.ArrayList(arr)
   }
 
-  def takeSample(withReplacement: Boolean, num: Int): JList[T] = 
+  def takeSample(withReplacement: Boolean, num: Int): JList[T] =
 takeSample(withReplacement, num, Utils.random.nextLong)
-
+
   def takeSample(withReplacement: Boolean, num: Int, seed: Long): JList[T] = {
 import scala.collection.JavaConversions._
 val arr: java.util.Collection[T] = rdd.takeSample(withReplacement, num, 
seed).toSeq

http://git-wip-us.apache.org/repos/asf/spark/blob/82a396c2/core/src/test/java/org/apache/spark/JavaAPISuite.java
--
diff --git a/core/src/test/java/org/apache/spark/JavaAPISuite.java 
b/core/src/test/java/org/apache/spark/JavaAPISuite.java
index c2089b0..dfd86d3 100644
--- a/core/src/test/java/org/apache/spark/JavaAPISuite.java
+++ b/core/src/test/java/org/apache/spark/JavaAPISuite.java
@@ -212,6 +212,8 @@ public class JavaAPISuite implements Serializable {
 
     JavaPairRDD<Integer, Integer> repartitioned =
         rdd.repartitionAndSortWithinPartitions(partitioner);
+    Assert.assertTrue(repartitioned.partitioner().isPresent());
+    Assert.assertEquals(repartitioned.partitioner().get(), partitioner);
     List<List<Tuple2<Integer, Integer>>> partitions = repartitioned.glom().collect();
     Assert.assertEquals(partitions.get(0), Arrays.asList(new Tuple2<Integer, Integer>(0, 5),
         new Tuple2<Integer, Integer>(0, 8), new Tuple2<Integer, Integer>(2, 6)));

http://git-wip-us.apache.org/repos/asf/spark/blob/82a396c2/project/MimaExcludes.scala
--
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index 11b439e..8da72b3 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -38,6 +38,8 @@ object MimaExcludes {
   Seq(
         MimaBuild.excludeSparkPackage("deploy"),
         MimaBuild.excludeSparkPackage("ml"),
+        // SPARK-7910 Adding a method to get the partioner to JavaRDD,
+        ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.api.java.JavaRDDLike.partitioner"),
 // SPARK-5922 Adding a generalized diff(other: RDD[(VertexId, 
VD)]) to 

spark git commit: [SPARK-7954] [SPARKR] Create SparkContext in sparkRSQL init

2015-05-29 Thread davies
Repository: spark
Updated Branches:
  refs/heads/master 82a396c2f -> 5fb97dca9


[SPARK-7954] [SPARKR] Create SparkContext in sparkRSQL init

cc davies

Author: Shivaram Venkataraman shiva...@cs.berkeley.edu

Closes #6507 from shivaram/sparkr-init and squashes the following commits:

6fdd169 [Shivaram Venkataraman] Create SparkContext in sparkRSQL init


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5fb97dca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5fb97dca
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5fb97dca

Branch: refs/heads/master
Commit: 5fb97dca9bcfc29ac33823554c8783997e811b99
Parents: 82a396c
Author: Shivaram Venkataraman shiva...@cs.berkeley.edu
Authored: Fri May 29 15:08:30 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Fri May 29 15:08:30 2015 -0700

--
 R/pkg/R/sparkR.R | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/5fb97dca/R/pkg/R/sparkR.R
--
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index 68387f0..5ced7c6 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -225,14 +225,21 @@ sparkR.init <- function(
 #' sqlContext <- sparkRSQL.init(sc)
 #'}
 
-sparkRSQL.init <- function(jsc) {
+sparkRSQL.init <- function(jsc = NULL) {
   if (exists(".sparkRSQLsc", envir = .sparkREnv)) {
     return(get(".sparkRSQLsc", envir = .sparkREnv))
   }
 
+  # If jsc is NULL, create a Spark Context
+  sc <- if (is.null(jsc)) {
+    sparkR.init()
+  } else {
+    jsc
+  }
+
   sqlContext <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
-                            "createSQLContext",
-                            jsc)
+                            "createSQLContext",
+                            sc)
   assign(".sparkRSQLsc", sqlContext, envir = .sparkREnv)
   sqlContext
 }
@@ -249,12 +256,19 @@ sparkRSQL.init <- function(jsc) {
 #' sqlContext <- sparkRHive.init(sc)
 #'}
 
-sparkRHive.init <- function(jsc) {
+sparkRHive.init <- function(jsc = NULL) {
   if (exists(".sparkRHivesc", envir = .sparkREnv)) {
     return(get(".sparkRHivesc", envir = .sparkREnv))
   }
 
-  ssc <- callJMethod(jsc, "sc")
+  # If jsc is NULL, create a Spark Context
+  sc <- if (is.null(jsc)) {
+    sparkR.init()
+  } else {
+    jsc
+  }
+
+  ssc <- callJMethod(sc, "sc")
   hiveCtx <- tryCatch({
     newJObject("org.apache.spark.sql.hive.HiveContext", ssc)
   }, error = function(err) {


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [HOTFIX] [SQL] Maven test compilation issue

2015-05-29 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master dbf8ff38d -> 8c9979337


[HOTFIX] [SQL] Maven test compilation issue

Tests compile in SBT but not Maven.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8c997933
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8c997933
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8c997933

Branch: refs/heads/master
Commit: 8c9979337f193c72fd2f1a891909283de53777e3
Parents: dbf8ff3
Author: Andrew Or and...@databricks.com
Authored: Fri May 29 15:26:49 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Fri May 29 15:26:49 2015 -0700

--
 sql/core/pom.xml | 7 +++
 1 file changed, 7 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/8c997933/sql/core/pom.xml
--
diff --git a/sql/core/pom.xml b/sql/core/pom.xml
index ffe95bb..8210c55 100644
--- a/sql/core/pom.xml
+++ b/sql/core/pom.xml
@@ -43,6 +43,13 @@
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
       <artifactId>spark-catalyst_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
     </dependency>


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7558] Guard against direct uses of FunSuite / FunSuiteLike

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 7ed06c399 -> 609c4923f


[SPARK-7558] Guard against direct uses of FunSuite / FunSuiteLike

This is a follow-up patch to #6441.

Author: Andrew Or and...@databricks.com

Closes #6510 from andrewor14/extends-funsuite-check and squashes the following 
commits:

6618b46 [Andrew Or] Exempt SparkSinkSuite from the FunSuite check
99d02ac [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
extends-funsuite-check
48874dd [Andrew Or] Guard against direct uses of FunSuite / FunSuiteLike
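
In practice the new scalastyle token check means the following (a sketch; `MySuite` is a hypothetical test, and `SparkFunSuite` is the base class introduced in #6441, which is package-private to org.apache.spark):

```
package org.apache.spark

import org.apache.spark.SparkFunSuite

// Passes the new style check: extends the Spark base suite, which logs test boundaries.
class MySuite extends SparkFunSuite {
  test("addition") {
    assert(1 + 1 === 2)
  }
}

// class MyOtherSuite extends org.scalatest.FunSuite { ... }
// would now be rejected by scalastyle with:
// "Tests must extend org.apache.spark.SparkFunSuite instead."
```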


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/609c4923
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/609c4923
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/609c4923

Branch: refs/heads/master
Commit: 609c4923f98c188bce60ae35c1c8a08a8dfd95f1
Parents: 7ed06c3
Author: Andrew Or and...@databricks.com
Authored: Fri May 29 22:57:46 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 22:57:46 2015 -0700

--
 core/src/test/scala/org/apache/spark/SparkFunSuite.scala| 2 ++
 .../apache/spark/streaming/flume/sink/SparkSinkSuite.scala  | 9 +
 scalastyle-config.xml   | 7 +++
 3 files changed, 18 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/609c4923/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/SparkFunSuite.scala 
b/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
index 0327dfa..8cb3443 100644
--- a/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
@@ -17,12 +17,14 @@
 
 package org.apache.spark
 
+// scalastyle:off
 import org.scalatest.{FunSuite, Outcome}
 
 /**
  * Base abstract class for all unit tests in Spark for handling common 
functionality.
  */
 private[spark] abstract class SparkFunSuite extends FunSuite with Logging {
+// scalastyle:on
 
   /**
* Log the suite name and the test name before and after each test.

http://git-wip-us.apache.org/repos/asf/spark/blob/609c4923/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
--
diff --git 
a/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
 
b/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
index 650b2fb..605b3fe 100644
--- 
a/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
+++ 
b/external/flume-sink/src/test/scala/org/apache/spark/streaming/flume/sink/SparkSinkSuite.scala
@@ -31,9 +31,18 @@ import org.apache.flume.Context
 import org.apache.flume.channel.MemoryChannel
 import org.apache.flume.event.EventBuilder
 import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory
+
+// Due to MNG-1378, there is not a way to include test dependencies 
transitively.
+// We cannot include Spark core tests as a dependency here because it depends 
on
+// Spark core main, which has too many dependencies to require here manually.
+// For this reason, we continue to use FunSuite and ignore the scalastyle 
checks
+// that fail if this is detected.
+//scalastyle:off
 import org.scalatest.FunSuite
 
 class SparkSinkSuite extends FunSuite {
+//scalastyle:on
+
   val eventsPerBatch = 1000
   val channelCapacity = 5000
 

http://git-wip-us.apache.org/repos/asf/spark/blob/609c4923/scalastyle-config.xml
--
diff --git a/scalastyle-config.xml b/scalastyle-config.xml
index 68c8ce3..890bf37 100644
--- a/scalastyle-config.xml
+++ b/scalastyle-config.xml
@@ -153,4 +153,11 @@
     </parameters>
   </check>
   <check level="error" class="org.scalastyle.scalariform.NotImplementedErrorUsage" enabled="true"></check>
+  <!-- As of SPARK-7558, all tests in Spark should extend o.a.s.SparkFunSuite instead of FunSuite directly -->
+  <check level="error" class="org.scalastyle.scalariform.TokenChecker" enabled="true">
+   <parameters>
+    <parameter name="regex">^FunSuite[A-Za-z]*$</parameter>
+   </parameters>
+   <customMessage>Tests must extend org.apache.spark.SparkFunSuite instead.</customMessage>
+  </check>
 </scalastyle>


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [DOCS][Tiny] Added a missing dash(-) in docs/configuration.md

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master a4f24123d -> 3792d2583


[DOCS][Tiny] Added a missing dash(-) in docs/configuration.md

The first line had only two dashes (--) instead of three (---). Because of this
missing dash (-), the 'jekyll build' command was not converting configuration.md
to _site/configuration.html.

Author: Taka Shinagawa taka.epsi...@gmail.com

Closes #6513 from mrt/docfix3 and squashes the following commits:

c470e2c [Taka Shinagawa] Added a missing dash(-) preventing jekyll from 
converting configuration.md to html format


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3792d258
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3792d258
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3792d258

Branch: refs/heads/master
Commit: 3792d25836e1e521da64c5a62ca1b6cca1bcb6b9
Parents: a4f2412
Author: Taka Shinagawa taka.epsi...@gmail.com
Authored: Fri May 29 20:35:14 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 20:35:14 2015 -0700

--
 docs/configuration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/3792d258/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index 30508a6..3a48da4 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1,4 +1,4 @@
---
+---
 layout: global
 displayTitle: Spark Configuration
 title: Configuration


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7957] Preserve partitioning when using randomSplit

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 3792d2583 -> 7ed06c399


[SPARK-7957] Preserve partitioning when using randomSplit

cc JoshRosen
Thanks for noticing this!

Author: Burak Yavuz brk...@gmail.com

Closes #6509 from brkyvz/sample-perf-reg and squashes the following commits:

497465d [Burak Yavuz] addressed code review
293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit
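
Why `preservesPartitioning` matters here, as a standalone sketch (assumes a live SparkContext named `sc`; this is not Spark's internal code): when the per-partition function leaves keys untouched, keeping the partitioner avoids an unnecessary shuffle in later key-based operations.

```
import org.apache.spark.HashPartitioner

val pairs = sc.parallelize(Seq(1 -> "a", 2 -> "b", 3 -> "c"))
  .partitionBy(new HashPartitioner(4))

// Identity transform per partition; keys are untouched, so the partitioner can be kept.
val kept = pairs.mapPartitionsWithIndex(
  (index, iter) => iter,
  preservesPartitioning = true)

assert(kept.partitioner == pairs.partitioner)  // still Some(HashPartitioner(4))
```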


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7ed06c39
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7ed06c39
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7ed06c39

Branch: refs/heads/master
Commit: 7ed06c39922ac90acab3a78ce0f2f21184ed68a5
Parents: 3792d25
Author: Burak Yavuz brk...@gmail.com
Authored: Fri May 29 22:19:15 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 22:19:15 2015 -0700

--
 core/src/main/scala/org/apache/spark/rdd/RDD.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7ed06c39/core/src/main/scala/org/apache/spark/rdd/RDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala 
b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 5fcef25..10610f4 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -434,11 +434,11 @@ abstract class RDD[T: ClassTag](
* @return A random sub-sample of the RDD without replacement.
*/
   private[spark] def randomSampleWithRange(lb: Double, ub: Double, seed: 
Long): RDD[T] = {
-    this.mapPartitionsWithIndex { case (index, partition) =>
+    this.mapPartitionsWithIndex( { (index, partition) =>
       val sampler = new BernoulliCellSampler[T](lb, ub)
       sampler.setSeed(seed + index)
       sampler.sample(partition)
-    }
+    }, preservesPartitioning = true)
   }
 
   /**


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-7957] Preserve partitioning when using randomSplit

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 400e6dbce -> 1513cffa3


[SPARK-7957] Preserve partitioning when using randomSplit

cc JoshRosen
Thanks for noticing this!

Author: Burak Yavuz brk...@gmail.com

Closes #6509 from brkyvz/sample-perf-reg and squashes the following commits:

497465d [Burak Yavuz] addressed code review
293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit

(cherry picked from commit 7ed06c39922ac90acab3a78ce0f2f21184ed68a5)
Signed-off-by: Reynold Xin r...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1513cffa
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1513cffa
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1513cffa

Branch: refs/heads/branch-1.4
Commit: 1513cffa35d520c2d4b620399944b19888d88fc2
Parents: 400e6db
Author: Burak Yavuz brk...@gmail.com
Authored: Fri May 29 22:19:15 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 22:19:23 2015 -0700

--
 core/src/main/scala/org/apache/spark/rdd/RDD.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1513cffa/core/src/main/scala/org/apache/spark/rdd/RDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala 
b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 5fcef25..10610f4 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -434,11 +434,11 @@ abstract class RDD[T: ClassTag](
* @return A random sub-sample of the RDD without replacement.
*/
   private[spark] def randomSampleWithRange(lb: Double, ub: Double, seed: 
Long): RDD[T] = {
-    this.mapPartitionsWithIndex { case (index, partition) =>
+    this.mapPartitionsWithIndex( { (index, partition) =>
       val sampler = new BernoulliCellSampler[T](lb, ub)
       sampler.setSeed(seed + index)
       sampler.sample(partition)
-    }
+    }, preservesPartitioning = true)
   }
 
   /**


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [DOCS][Tiny] Added a missing dash(-) in docs/configuration.md

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 9a88be183 -> 400e6dbce


[DOCS][Tiny] Added a missing dash(-) in docs/configuration.md

The first line had only two dashes (--) instead of three (---). Because of this
missing dash (-), the 'jekyll build' command was not converting configuration.md
to _site/configuration.html.

Author: Taka Shinagawa taka.epsi...@gmail.com

Closes #6513 from mrt/docfix3 and squashes the following commits:

c470e2c [Taka Shinagawa] Added a missing dash(-) preventing jekyll from 
converting configuration.md to html format

(cherry picked from commit 3792d25836e1e521da64c5a62ca1b6cca1bcb6b9)
Signed-off-by: Reynold Xin r...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/400e6dbc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/400e6dbc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/400e6dbc

Branch: refs/heads/branch-1.4
Commit: 400e6dbce2f8e62c6e6dc7d5bd82a445a40100b7
Parents: 9a88be1
Author: Taka Shinagawa taka.epsi...@gmail.com
Authored: Fri May 29 20:35:14 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 20:35:26 2015 -0700

--
 docs/configuration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/400e6dbc/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index 30508a6..3a48da4 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1,4 +1,4 @@
---
+---
 layout: global
 displayTitle: Spark Configuration
 title: Configuration


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



Git Push Summary

2015-05-29 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc3 [created] fb60503ff

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[6/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log

2015-05-29 Thread andrewor14
[SPARK-7558] Demarcate tests in unit-tests.log

Right now `unit-tests.log` are not of much value because we can't tell where 
the test boundaries are easily. This patch adds log statements before and after 
each test to outline the test boundaries, e.g.:

```
= TEST OUTPUT FOR o.a.s.serializer.KryoSerializerSuite: 'kryo with 
parallelize for primitive arrays' =

15/05/27 12:36:39.596 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite 
INFO SparkContext: Starting job: count at KryoSerializerSuite.scala:230
15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Got job 3 
(count at KryoSerializerSuite.scala:230) with 4 output partitions 
(allowLocal=false)
15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Final stage: 
ResultStage 3(count at KryoSerializerSuite.scala:230)
15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Parents of 
final stage: List()
15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Missing 
parents: List()
15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Submitting 
ResultStage 3 (ParallelCollectionRDD[5] at parallelize at 
KryoSerializerSuite.scala:230), which has no missing parents

...

15/05/27 12:36:39.624 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite 
INFO DAGScheduler: Job 3 finished: count at KryoSerializerSuite.scala:230, took 
0.028563 s
15/05/27 12:36:39.625 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite 
INFO KryoSerializerSuite:

* FINISHED o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for 
primitive arrays' *

...
```
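
The mechanism is straightforward: a shared base suite overrides ScalaTest's `withFixture` to print a banner before and after each test. A simplified sketch of the idea (not the actual SparkFunSuite implementation):

```
import org.scalatest.{FunSuite, Outcome}

// Simplified stand-in: every test suite would extend this instead of FunSuite directly.
abstract class LoggingFunSuite extends FunSuite {
  protected override def withFixture(test: NoArgTest): Outcome = {
    val suite = getClass.getSimpleName
    println(s"===== TEST OUTPUT FOR $suite: '${test.name}' =====")
    try {
      test()
    } finally {
      println(s"***** FINISHED $suite: '${test.name}' *****")
    }
  }
}
```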

Author: Andrew Or and...@databricks.com

Closes #6441 from andrewor14/demarcate-tests and squashes the following commits:

879b060 [Andrew Or] Fix compile after rebase
d622af7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
demarcate-tests
017c8ba [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
demarcate-tests
7790b6c [Andrew Or] Fix tests after logical merge conflict
c7460c0 [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
demarcate-tests
c43ffc4 [Andrew Or] Fix tests?
8882581 [Andrew Or] Fix tests
ee22cda [Andrew Or] Fix log message
fa9450e [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
demarcate-tests
12d1e1b [Andrew Or] Various whitespace changes (minor)
69cbb24 [Andrew Or] Make all test suites extend SparkFunSuite instead of 
FunSuite
bbce12e [Andrew Or] Fix manual things that cannot be covered through automation
da0b12f [Andrew Or] Add core tests as dependencies in all modules
f7d29ce [Andrew Or] Introduce base abstract class for all test suites


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9eb222c1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9eb222c1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9eb222c1

Branch: refs/heads/master
Commit: 9eb222c13991c2b4a22db485710dc2e27ccf06dd
Parents: 94f62a4
Author: Andrew Or and...@databricks.com
Authored: Fri May 29 14:03:12 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Fri May 29 14:03:12 2015 -0700

--
 bagel/pom.xml   |  7 +++
 .../org/apache/spark/bagel/BagelSuite.scala |  4 +-
 core/pom.xml|  6 +++
 .../org/apache/spark/AccumulatorSuite.scala |  3 +-
 .../org/apache/spark/CacheManagerSuite.scala|  4 +-
 .../org/apache/spark/CheckpointSuite.scala  |  4 +-
 .../org/apache/spark/ContextCleanerSuite.scala  |  4 +-
 .../org/apache/spark/DistributedSuite.scala |  3 +-
 .../scala/org/apache/spark/DriverSuite.scala|  3 +-
 .../spark/ExecutorAllocationManagerSuite.scala  |  8 +++-
 .../scala/org/apache/spark/FailureSuite.scala   |  4 +-
 .../org/apache/spark/FileServerSuite.scala  |  3 +-
 .../test/scala/org/apache/spark/FileSuite.scala |  3 +-
 .../org/apache/spark/FutureActionSuite.scala|  8 +++-
 .../apache/spark/HeartbeatReceiverSuite.scala   |  3 +-
 .../apache/spark/ImplicitOrderingSuite.scala|  4 +-
 .../org/apache/spark/JobCancellationSuite.scala |  4 +-
 .../apache/spark/MapOutputTrackerSuite.scala|  3 +-
 .../org/apache/spark/PartitioningSuite.scala|  4 +-
 .../org/apache/spark/SSLOptionsSuite.scala  |  4 +-
 .../org/apache/spark/SecurityManagerSuite.scala |  4 +-
 .../scala/org/apache/spark/ShuffleSuite.scala   |  3 +-
 .../scala/org/apache/spark/SparkConfSuite.scala |  3 +-
 .../apache/spark/SparkContextInfoSuite.scala|  4 +-
 .../SparkContextSchedulerCreationSuite.scala|  4 +-
 .../org/apache/spark/SparkContextSuite.scala|  4 +-
 .../scala/org/apache/spark/SparkFunSuite.scala  | 46 
 .../org/apache/spark/StatusTrackerSuite.scala   |  4 +-
 .../scala/org/apache/spark/ThreadingSuite.scala |  3 +-
 .../scala/org/apache/spark/UnpersistSuite.scala |  3 +-
 

[4/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log

2015-05-29 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
index cb99d14..a2a6d70 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
@@ -21,9 +21,9 @@ import java.util.Comparator
 
 import scala.collection.mutable.HashSet
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class AppendOnlyMapSuite extends FunSuite {
+class AppendOnlyMapSuite extends SparkFunSuite {
   test(initialization) {
 val goodMap1 = new AppendOnlyMap[Int, Int](1)
 assert(goodMap1.size === 0)

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
index ffc2069..69dbfa9 100644
--- a/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
@@ -17,9 +17,9 @@
 
 package org.apache.spark.util.collection
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class BitSetSuite extends FunSuite {
+class BitSetSuite extends SparkFunSuite {
 
   test(basic set and get) {
 val setBits = Seq(0, 9, 1, 10, 90, 96)

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
index c0c38cd..05306f4 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
@@ -19,10 +19,11 @@ package org.apache.spark.util.collection
 
 import java.nio.ByteBuffer
 
-import org.scalatest.FunSuite
 import org.scalatest.Matchers._
 
-class ChainedBufferSuite extends FunSuite {
+import org.apache.spark.SparkFunSuite
+
+class ChainedBufferSuite extends SparkFunSuite {
   test(write and read at start) {
 // write from start of source array
 val buffer = new ChainedBuffer(8)

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
index 6c956d9..bc54799 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
@@ -17,9 +17,9 @@
 
 package org.apache.spark.util.collection
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class CompactBufferSuite extends FunSuite {
+class CompactBufferSuite extends SparkFunSuite {
   test(empty buffer) {
 val b = new CompactBuffer[Int]
 assert(b.size === 0)

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
 
b/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
index dff8f3d..79eba61 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
@@ -19,12 +19,10 @@ package org.apache.spark.util.collection
 
 import scala.collection.mutable.ArrayBuffer
 
-import org.scalatest.FunSuite
-
 import org.apache.spark._
 import org.apache.spark.io.CompressionCodec
 
-class ExternalAppendOnlyMapSuite extends FunSuite with LocalSparkContext {
+class ExternalAppendOnlyMapSuite extends SparkFunSuite with LocalSparkContext {
   private val allCompressionCodecs = CompressionCodec.ALL_COMPRESSION_CODECS
   private def createCombiner[T](i: T) = ArrayBuffer[T](i)
   private def mergeValue[T](buffer: ArrayBuffer[T], i: T): ArrayBuffer[T] = 
buffer += i


[1/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log

2015-05-29 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 94f62a497 - 9eb222c13


http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
--
diff --git 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
index b343cbb..7509000 100644
--- a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
+++ b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
@@ -26,13 +26,13 @@ import org.apache.hadoop.yarn.api.records._
 import org.apache.hadoop.yarn.client.api.AMRMClient
 import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
 
-import org.apache.spark.SecurityManager
+import org.apache.spark.{SecurityManager, SparkFunSuite}
 import org.apache.spark.SparkConf
 import org.apache.spark.deploy.yarn.YarnSparkHadoopUtil._
 import org.apache.spark.deploy.yarn.YarnAllocator._
 import org.apache.spark.scheduler.SplitInfo
 
-import org.scalatest.{BeforeAndAfterEach, FunSuite, Matchers}
+import org.scalatest.{BeforeAndAfterEach, Matchers}
 
 class MockResolver extends DNSToSwitchMapping {
 
@@ -46,7 +46,7 @@ class MockResolver extends DNSToSwitchMapping {
   def reloadCachedMappings(names: JList[String]) {}
 }
 
-class YarnAllocatorSuite extends FunSuite with Matchers with 
BeforeAndAfterEach {
+class YarnAllocatorSuite extends SparkFunSuite with Matchers with 
BeforeAndAfterEach {
   val conf = new Configuration()
   conf.setClass(
 CommonConfigurationKeysPublic.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY,

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
--
diff --git 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
index dcaeb2e..d8bc253 100644
--- a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
+++ b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
@@ -30,9 +30,9 @@ import com.google.common.io.ByteStreams
 import com.google.common.io.Files
 import org.apache.hadoop.yarn.conf.YarnConfiguration
 import org.apache.hadoop.yarn.server.MiniYARNCluster
-import org.scalatest.{BeforeAndAfterAll, FunSuite, Matchers}
+import org.scalatest.{BeforeAndAfterAll, Matchers}
 
-import org.apache.spark.{Logging, SparkConf, SparkContext, SparkException, 
TestUtils}
+import org.apache.spark._
 import org.apache.spark.scheduler.cluster.ExecutorInfo
 import org.apache.spark.scheduler.{SparkListener, 
SparkListenerApplicationStart,
   SparkListenerExecutorAdded}
@@ -43,7 +43,7 @@ import org.apache.spark.util.Utils
  * applications, and require the Spark assembly to be built before they can be 
successfully
  * run.
  */
-class YarnClusterSuite extends FunSuite with BeforeAndAfterAll with Matchers 
with Logging {
+class YarnClusterSuite extends SparkFunSuite with BeforeAndAfterAll with 
Matchers with Logging {
 
   // log4j configuration for the YARN containers, so that their output is 
collected
   // by YARN instead of trying to overwrite unit-tests.log.

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala
--
diff --git 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala
 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala
index e10b985..49bee08 100644
--- 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala
+++ 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtilSuite.scala
@@ -25,15 +25,15 @@ import org.apache.hadoop.fs.Path
 import org.apache.hadoop.yarn.api.ApplicationConstants
 import org.apache.hadoop.yarn.api.ApplicationConstants.Environment
 import org.apache.hadoop.yarn.conf.YarnConfiguration
-import org.scalatest.{FunSuite, Matchers}
+import org.scalatest.Matchers
 
 import org.apache.hadoop.yarn.api.records.ApplicationAccessType
 
-import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkException}
+import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkException, 
SparkFunSuite}
 import org.apache.spark.util.Utils
 
 
-class YarnSparkHadoopUtilSuite extends FunSuite with Matchers with Logging {
+class YarnSparkHadoopUtilSuite extends SparkFunSuite with Matchers with 
Logging {
 
   val hasBash =
 try {





[2/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log

2015-05-29 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
--
diff --git 
a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala 
b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 14f5e9e..9ecc7c2 100644
--- a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -24,14 +24,13 @@ import scala.collection.mutable.ArrayBuffer
 import scala.concurrent.duration._
 import scala.tools.nsc.interpreter.SparkILoop
 
-import org.scalatest.FunSuite
 import org.apache.commons.lang3.StringEscapeUtils
-import org.apache.spark.SparkContext
+import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.util.Utils
 
 
 
-class ReplSuite extends FunSuite {
+class ReplSuite extends SparkFunSuite {
 
   def runInterpreter(master: String, input: String): String = {
 val CONF_EXECUTOR_CLASSPATH = spark.executor.extraClassPath

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
--
diff --git 
a/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala 
b/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
index c709cde..a58eda1 100644
--- a/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
+++ b/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
@@ -25,7 +25,6 @@ import scala.language.implicitConversions
 import scala.language.postfixOps
 
 import org.scalatest.BeforeAndAfterAll
-import org.scalatest.FunSuite
 import org.scalatest.concurrent.Interruptor
 import org.scalatest.concurrent.Timeouts._
 import org.scalatest.mock.MockitoSugar
@@ -35,7 +34,7 @@ import org.apache.spark._
 import org.apache.spark.util.Utils
 
 class ExecutorClassLoaderSuite
-  extends FunSuite
+  extends SparkFunSuite
   with BeforeAndAfterAll
   with MockitoSugar
   with Logging {

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/sql/catalyst/pom.xml
--
diff --git a/sql/catalyst/pom.xml b/sql/catalyst/pom.xml
index 5c322d0..d9e1cdb 100644
--- a/sql/catalyst/pom.xml
+++ b/sql/catalyst/pom.xml
@@ -52,6 +52,13 @@
 /dependency
 dependency
   groupIdorg.apache.spark/groupId
+  artifactIdspark-core_${scala.binary.version}/artifactId
+  version${project.version}/version
+  typetest-jar/type
+  scopetest/scope
+/dependency
+dependency
+  groupIdorg.apache.spark/groupId
   artifactIdspark-unsafe_${scala.binary.version}/artifactId
   version${project.version}/version
 /dependency

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
index ea82cd2..c046dbf 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
@@ -17,14 +17,13 @@
 
 package org.apache.spark.sql.catalyst
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.sql.catalyst.plans.physical._
 
 /* Implicit conversions */
 import org.apache.spark.sql.catalyst.dsl.expressions._
 
-class DistributionSuite extends FunSuite {
+class DistributionSuite extends SparkFunSuite {
 
   protected def checkSatisfied(
   inputPartitioning: Partitioning,

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala
index 7ff51db..9a24b23 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala
@@ -20,8 +20,7 @@ package org.apache.spark.sql.catalyst
 import java.math.BigInteger
 import java.sql.{Date, Timestamp}
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.sql.catalyst.expressions.Row
 import org.apache.spark.sql.types._
 
@@ -75,7 +74,7 @@ case class MultipleConstructorsData(a: Int, b: String, c: 
Double) {
   def this(b: String, a: Int) = this(a, 

[5/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log

2015-05-29 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
index 46d2e51..3940527 100644
--- 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
@@ -31,12 +31,12 @@ import org.apache.spark.network.buffer.{ManagedBuffer, 
NioManagedBuffer}
 import org.apache.spark.network.shuffle.BlockFetchingListener
 import org.apache.spark.network.{BlockDataManager, BlockTransferService}
 import org.apache.spark.storage.{BlockId, ShuffleBlockId}
-import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
 import org.mockito.Mockito._
 import org.scalatest.mock.MockitoSugar
-import org.scalatest.{FunSuite, ShouldMatchers}
+import org.scalatest.ShouldMatchers
 
-class NettyBlockTransferSecuritySuite extends FunSuite with MockitoSugar with 
ShouldMatchers {
+class NettyBlockTransferSecuritySuite extends SparkFunSuite with MockitoSugar 
with ShouldMatchers {
   test(security default off) {
 val conf = new SparkConf()
   .set(spark.app.id, app-id)

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
index a41f8b7..6f8e8a7 100644
--- 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
@@ -18,11 +18,15 @@
 package org.apache.spark.network.netty
 
 import org.apache.spark.network.BlockDataManager
-import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
 import org.mockito.Mockito.mock
 import org.scalatest._
 
-class NettyBlockTransferServiceSuite extends FunSuite with BeforeAndAfterEach 
with ShouldMatchers {
+class NettyBlockTransferServiceSuite
+  extends SparkFunSuite
+  with BeforeAndAfterEach
+  with ShouldMatchers {
+
   private var service0: NettyBlockTransferService = _
   private var service1: NettyBlockTransferService = _
 

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala 
b/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
index 02424c5..5e364cc 100644
--- 
a/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
@@ -24,15 +24,13 @@ import scala.concurrent.duration._
 import scala.concurrent.{Await, TimeoutException}
 import scala.language.postfixOps
 
-import org.scalatest.FunSuite
-
-import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
 import org.apache.spark.util.Utils
 
 /**
   * Test the ConnectionManager with various security settings.
   */
-class ConnectionManagerSuite extends FunSuite {
+class ConnectionManagerSuite extends SparkFunSuite {
 
   test(security default off) {
 val conf = new SparkConf

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
index f2b0ea1..ec99f2a 100644
--- a/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
@@ -23,13 +23,13 @@ import scala.concurrent.{Await, TimeoutException}
 import scala.concurrent.duration.Duration
 import scala.concurrent.ExecutionContext.Implicits.global
 
-import org.scalatest.{BeforeAndAfterAll, FunSuite}
+import org.scalatest.BeforeAndAfterAll
 import org.scalatest.concurrent.Timeouts
 import org.scalatest.time.SpanSugar._
 
-import org.apache.spark.{SparkContext, SparkException, LocalSparkContext}
+import org.apache.spark.{LocalSparkContext, SparkContext, SparkException, 
SparkFunSuite}
 
-class AsyncRDDActionsSuite 

[3/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log

2015-05-29 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
index 3efffbb..7891156 100644
--- 
a/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
@@ -17,8 +17,7 @@
 
 package org.apache.spark.ml.regression
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.ml.impl.TreeTests
 import org.apache.spark.mllib.regression.LabeledPoint
 import org.apache.spark.mllib.tree.{EnsembleTestHelper, RandomForest = 
OldRandomForest}
@@ -31,7 +30,7 @@ import org.apache.spark.sql.DataFrame
 /**
  * Test suite for [[RandomForestRegressor]].
  */
-class RandomForestRegressorSuite extends FunSuite with MLlibTestSparkContext {
+class RandomForestRegressorSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 
   import RandomForestRegressorSuite.compareAPIs
 
@@ -98,7 +97,7 @@ class RandomForestRegressorSuite extends FunSuite with 
MLlibTestSparkContext {
   */
 }
 
-private object RandomForestRegressorSuite extends FunSuite {
+private object RandomForestRegressorSuite extends SparkFunSuite {
 
   /**
* Train 2 models on the given dataset, one using the old API and one using 
the new API.

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
index 60d8bfe..5ba469c 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
@@ -17,7 +17,7 @@
 
 package org.apache.spark.ml.tuning
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
 import org.apache.spark.ml.{Estimator, Model}
 import org.apache.spark.ml.classification.LogisticRegression
@@ -29,7 +29,7 @@ import org.apache.spark.mllib.util.MLlibTestSparkContext
 import org.apache.spark.sql.{DataFrame, SQLContext}
 import org.apache.spark.sql.types.StructType
 
-class CrossValidatorSuite extends FunSuite with MLlibTestSparkContext {
+class CrossValidatorSuite extends SparkFunSuite with MLlibTestSparkContext {
 
   @transient var dataset: DataFrame = _
 

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
index 20aa100..810b700 100644
--- 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
@@ -19,11 +19,10 @@ package org.apache.spark.ml.tuning
 
 import scala.collection.mutable
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.ml.param.{ParamMap, TestParams}
 
-class ParamGridBuilderSuite extends FunSuite {
+class ParamGridBuilderSuite extends SparkFunSuite {
 
   val solver = new TestParams()
   import solver.{inputCol, maxIter}

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
index 3d362b5..5994441 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/api/python/PythonMLLibAPISuite.scala
@@ -17,13 +17,12 @@
 
 package org.apache.spark.mllib.api.python
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.mllib.linalg.{DenseMatrix, Matrices, Vectors, 
SparseMatrix}
 import org.apache.spark.mllib.regression.LabeledPoint
 import org.apache.spark.mllib.recommendation.Rating
 
-class PythonMLLibAPISuite extends FunSuite {
+class PythonMLLibAPISuite extends SparkFunSuite {
 
   SerDe.initialize()
 

http://git-wip-us.apache.org/repos/asf/spark/blob/9eb222c1/mllib/src/test/scala/org/apache/spark/mllib/classification/LogisticRegressionSuite.scala

spark git commit: Revert [SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior

2015-05-29 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 c68abaa34 - 18811ca20


Revert [SQL] [TEST] [MINOR] Uses a temporary log4j.properties in 
HiveThriftServer2Test to ensure expected logging behavior

This reverts commit 645e611644be3f62ef07e4ca7628bf298349d9a6.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/18811ca2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/18811ca2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/18811ca2

Branch: refs/heads/branch-1.4
Commit: 18811ca20bc3c9c32b6bfbefb3d20092b7889ca8
Parents: c68abaa
Author: Patrick Wendell patr...@databricks.com
Authored: Fri May 29 13:03:52 2015 -0700
Committer: Patrick Wendell patr...@databricks.com
Committed: Fri May 29 13:03:52 2015 -0700

--
 .../thriftserver/HiveThriftServer2Suites.scala  | 31 
 1 file changed, 6 insertions(+), 25 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/18811ca2/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
--
diff --git 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
index 610939c..1fadea9 100644
--- 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
+++ 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
@@ -19,8 +19,6 @@ package org.apache.spark.sql.hive.thriftserver
 
 import java.io.File
 import java.net.URL
-import java.nio.charset.StandardCharsets
-import java.nio.file.{Files, Paths}
 import java.sql.{Date, DriverManager, Statement}
 
 import scala.collection.mutable.ArrayBuffer
@@ -56,7 +54,7 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
   override def mode: ServerMode.Value = ServerMode.binary
 
   private def withCLIServiceClient(f: ThriftCLIServiceClient = Unit): Unit = {
-// Transport creation logic below mimics 
HiveConnection.createBinaryTransport
+// Transport creation logics below mimics 
HiveConnection.createBinaryTransport
 val rawTransport = new TSocket(localhost, serverPort)
 val user = System.getProperty(user.name)
 val transport = PlainSaslHelper.getPlainTransport(user, anonymous, 
rawTransport)
@@ -393,10 +391,10 @@ abstract class HiveThriftJdbcTest extends 
HiveThriftServer2Test {
 val statements = connections.map(_.createStatement())
 
 try {
-  statements.zip(fs).foreach { case (s, f) = f(s) }
+  statements.zip(fs).map { case (s, f) = f(s) }
 } finally {
-  statements.foreach(_.close())
-  connections.foreach(_.close())
+  statements.map(_.close())
+  connections.map(_.close())
 }
   }
 
@@ -435,32 +433,15 @@ abstract class HiveThriftServer2Test extends FunSuite 
with BeforeAndAfterAll wit
   ConfVars.HIVE_SERVER2_THRIFT_HTTP_PORT
 }
 
-val driverClassPath = {
-  // Writes a temporary log4j.properties and prepend it to driver 
classpath, so that it
-  // overrides all other potential log4j configurations contained in other 
dependency jar files.
-  val tempLog4jConf = Utils.createTempDir().getCanonicalPath
-
-  Files.write(
-Paths.get(s$tempLog4jConf/log4j.properties),
-log4j.rootCategory=INFO, console
-  |log4j.appender.console=org.apache.log4j.ConsoleAppender
-  |log4j.appender.console.target=System.err
-  |log4j.appender.console.layout=org.apache.log4j.PatternLayout
-  |log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd 
HH:mm:ss} %p %c{1}: %m%n
-.stripMargin.getBytes(StandardCharsets.UTF_8))
-
-  tempLog4jConf + File.pathSeparator + sys.props(java.class.path)
-}
-
 s$startScript
|  --master local
+   |  --hiveconf hive.root.logger=INFO,console
|  --hiveconf ${ConfVars.METASTORECONNECTURLKEY}=$metastoreJdbcUri
|  --hiveconf ${ConfVars.METASTOREWAREHOUSE}=$warehousePath
|  --hiveconf ${ConfVars.HIVE_SERVER2_THRIFT_BIND_HOST}=localhost
|  --hiveconf ${ConfVars.HIVE_SERVER2_TRANSPORT_MODE}=$mode
|  --hiveconf $portConf=$port
-   |  --driver-class-path $driverClassPath
-   |  --driver-java-options -Dlog4j.debug
+   |  --driver-class-path ${sys.props(java.class.path)}
|  --conf spark.ui.enabled=false
  .stripMargin.split(\\s+).toSeq
   }





Git Push Summary

2015-05-29 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc3 [deleted] fb60503ff




Git Push Summary

2015-05-29 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.4.0-rc3 [created] dd109a874




[2/2] spark git commit: Preparing development version 1.4.0-SNAPSHOT

2015-05-29 Thread pwendell
Preparing development version 1.4.0-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e549874c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e549874c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e549874c

Branch: refs/heads/branch-1.4
Commit: e549874c33aca1073c223451e82d91a1577a4132
Parents: dd109a8
Author: Patrick Wendell pwend...@gmail.com
Authored: Fri May 29 13:07:07 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Fri May 29 13:07:07 2015 -0700

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 launcher/pom.xml  | 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 unsafe/pom.xml| 2 +-
 yarn/pom.xml  | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e549874c/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b8a821d..626c857 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e549874c/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c1aa32b..1f3dec9 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e549874c/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index a9b8b42..e58efe4 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e549874c/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 38ff67d..e4efee7 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e549874c/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index e8784eb..1f3e619 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e549874c/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 1794f3e..8df7edb 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+

spark git commit: [SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, LARROW, RARROW in style checker.

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 e549874c3 - f40605f06


[SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, 
LARROW, RARROW in style checker.

…

Author: Reynold Xin r...@databricks.com

Closes #6491 from rxin/more-whitespace and squashes the following commits:

f6e63dc [Reynold Xin] [SPARK-7940] Enforce whitespace checking for DO, TRY, 
CATCH, FINALLY, MATCH, LARROW, RARROW in style checker.

(cherry picked from commit 94f62a4979e4bc5f7bf4f5852d76977e097209e6)
Signed-off-by: Reynold Xin r...@databricks.com
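
To make the new checks concrete, an illustrative Scala snippet of the spacing
they enforce (the rejected forms in the comments are assumptions about what the
checker flags, based on the keywords named in the title):

    object WhitespaceStyleExamples {
      // Rejected by the checker: try{ ... }catch{ case e: Exception=>None }
      // Accepted: a single space around TRY, CATCH, FINALLY and the arrow.
      def parse(s: String): Option[Int] = {
        try {
          Some(s.trim.toInt)
        } catch {
          case _: NumberFormatException => None
        }
      }

      // Rejected: for (i<-1 to 3) yield i * 2  (no whitespace around LARROW).
      val doubled: Seq[Int] = for (i <- 1 to 3) yield i * 2

      // Rejected: x match{ case 0=>"zero" }  (no space before MATCH or around RARROW).
      def describe(x: Int): String = x match {
        case 0 => "zero"
        case _ => "other"
      }
    }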


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f40605f0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f40605f0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f40605f0

Branch: refs/heads/branch-1.4
Commit: f40605f064c1b3a3415afb65707004250e963c97
Parents: e549874
Author: Reynold Xin r...@databricks.com
Authored: Fri May 29 13:38:37 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 13:39:02 2015 -0700

--
 .../main/scala/org/apache/spark/network/nio/BlockMessage.scala  | 2 +-
 .../main/scala/org/apache/spark/network/nio/Connection.scala| 5 ++---
 .../scala/org/apache/spark/network/nio/ConnectionManager.scala  | 5 ++---
 .../scala/org/apache/spark/rdd/PartitionerAwareUnionRDD.scala   | 2 +-
 .../src/main/scala/org/apache/spark/mllib/tree/model/Node.scala | 2 +-
 .../spark/mllib/classification/LogisticRegressionSuite.scala| 4 ++--
 scalastyle-config.xml   | 4 ++--
 .../src/main/scala/org/apache/spark/sql/types/UTF8String.scala  | 2 +-
 sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala | 2 +-
 9 files changed, 13 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f40605f0/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala 
b/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
index 1a92a79..67a3761 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
@@ -155,7 +155,7 @@ private[nio] class BlockMessage() {
 
   override def toString: String = {
 BlockMessage [type =  + typ + , id =  + id + , level =  + level +
-, data =  + (if (data != null) data.remaining.toString  else null) + 
]
+, data =  + (if (data != null) data.remaining.toString else null) + ]
   }
 }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f40605f0/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
--
diff --git a/core/src/main/scala/org/apache/spark/network/nio/Connection.scala 
b/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
index 6b898bd..1499da0 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
@@ -326,15 +326,14 @@ class SendingConnection(val address: InetSocketAddress, 
selector_ : Selector,
 
   // MUST be called within the selector loop
   def connect() {
-try{
+try {
   channel.register(selector, SelectionKey.OP_CONNECT)
   channel.connect(address)
   logInfo(Initiating connection to [ + address + ])
 } catch {
-  case e: Exception = {
+  case e: Exception =
 logError(Error connecting to  + address, e)
 callOnExceptionCallbacks(e)
-  }
 }
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f40605f0/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala 
b/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
index 497871e..c0bca2c 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
@@ -635,12 +635,11 @@ private[nio] class ConnectionManager(
 val message = securityMsgResp.toBufferMessage
 if (message == null) throw new IOException(Error creating security 
message)
 sendSecurityMessage(waitingConn.getRemoteConnectionManagerId(), 
message)
-  } catch  {
-case e: Exception = {
+  } catch {
+case e: Exception =
   logError(Error handling sasl client authentication, e)
   waitingConn.close()
   throw new IOException(Error evaluating sasl response: , e)
-}
   }
 }
   }


spark git commit: [SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, LARROW, RARROW in style checker.

2015-05-29 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 6181937f3 - 94f62a497


[SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, 
LARROW, RARROW in style checker.

…

Author: Reynold Xin r...@databricks.com

Closes #6491 from rxin/more-whitespace and squashes the following commits:

f6e63dc [Reynold Xin] [SPARK-7940] Enforce whitespace checking for DO, TRY, 
CATCH, FINALLY, MATCH, LARROW, RARROW in style checker.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/94f62a49
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/94f62a49
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/94f62a49

Branch: refs/heads/master
Commit: 94f62a4979e4bc5f7bf4f5852d76977e097209e6
Parents: 6181937
Author: Reynold Xin r...@databricks.com
Authored: Fri May 29 13:38:37 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Fri May 29 13:38:37 2015 -0700

--
 .../main/scala/org/apache/spark/network/nio/BlockMessage.scala  | 2 +-
 .../main/scala/org/apache/spark/network/nio/Connection.scala| 5 ++---
 .../scala/org/apache/spark/network/nio/ConnectionManager.scala  | 5 ++---
 .../scala/org/apache/spark/rdd/PartitionerAwareUnionRDD.scala   | 2 +-
 .../src/main/scala/org/apache/spark/mllib/tree/model/Node.scala | 2 +-
 .../spark/mllib/classification/LogisticRegressionSuite.scala| 4 ++--
 scalastyle-config.xml   | 4 ++--
 .../src/main/scala/org/apache/spark/sql/types/UTF8String.scala  | 2 +-
 sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala | 2 +-
 9 files changed, 13 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/94f62a49/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala 
b/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
index 1a92a79..67a3761 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/BlockMessage.scala
@@ -155,7 +155,7 @@ private[nio] class BlockMessage() {
 
   override def toString: String = {
 BlockMessage [type =  + typ + , id =  + id + , level =  + level +
-, data =  + (if (data != null) data.remaining.toString  else null) + 
]
+, data =  + (if (data != null) data.remaining.toString else null) + ]
   }
 }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/94f62a49/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
--
diff --git a/core/src/main/scala/org/apache/spark/network/nio/Connection.scala 
b/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
index 6b898bd..1499da0 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
@@ -326,15 +326,14 @@ class SendingConnection(val address: InetSocketAddress, 
selector_ : Selector,
 
   // MUST be called within the selector loop
   def connect() {
-try{
+try {
   channel.register(selector, SelectionKey.OP_CONNECT)
   channel.connect(address)
   logInfo(Initiating connection to [ + address + ])
 } catch {
-  case e: Exception = {
+  case e: Exception =
 logError(Error connecting to  + address, e)
 callOnExceptionCallbacks(e)
-  }
 }
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/94f62a49/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala 
b/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
index 497871e..c0bca2c 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
@@ -635,12 +635,11 @@ private[nio] class ConnectionManager(
 val message = securityMsgResp.toBufferMessage
 if (message == null) throw new IOException(Error creating security 
message)
 sendSecurityMessage(waitingConn.getRemoteConnectionManagerId(), 
message)
-  } catch  {
-case e: Exception = {
+  } catch {
+case e: Exception =
   logError(Error handling sasl client authentication, e)
   waitingConn.close()
   throw new IOException(Error evaluating sasl response: , e)
-}
   }
 }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/94f62a49/core/src/main/scala/org/apache/spark/rdd/PartitionerAwareUnionRDD.scala

[2/2] spark git commit: Preparing development version 1.4.0-SNAPSHOT

2015-05-29 Thread pwendell
Preparing development version 1.4.0-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c68abaa3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c68abaa3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c68abaa3

Branch: refs/heads/branch-1.4
Commit: c68abaa34ed1f4de8af5c72f67ebd14478c52220
Parents: fb60503
Author: Patrick Wendell pwend...@gmail.com
Authored: Fri May 29 12:15:18 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Fri May 29 12:15:18 2015 -0700

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 launcher/pom.xml  | 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 unsafe/pom.xml| 2 +-
 yarn/pom.xml  | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c68abaa3/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b8a821d..626c857 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c68abaa3/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c1aa32b..1f3dec9 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c68abaa3/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index a9b8b42..e58efe4 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c68abaa3/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 38ff67d..e4efee7 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c68abaa3/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index e8784eb..1f3e619 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+version1.4.0-SNAPSHOT/version
 relativePath../../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c68abaa3/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 1794f3e..8df7edb 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0/version
+

[1/2] spark git commit: Preparing Spark release v1.4.0-rc3

2015-05-29 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 18811ca20 - e549874c3


Preparing Spark release v1.4.0-rc3


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dd109a87
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dd109a87
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dd109a87

Branch: refs/heads/branch-1.4
Commit: dd109a8746ec07c7c83995890fc2c0cd7a693730
Parents: 18811ca
Author: Patrick Wendell pwend...@gmail.com
Authored: Fri May 29 13:06:59 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Fri May 29 13:06:59 2015 -0700

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 launcher/pom.xml  | 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 unsafe/pom.xml| 2 +-
 yarn/pom.xml  | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/dd109a87/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 626c857..b8a821d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0-SNAPSHOT/version
+version1.4.0/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/dd109a87/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 1f3dec9..c1aa32b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0-SNAPSHOT/version
+version1.4.0/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/dd109a87/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index e58efe4..a9b8b42 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0-SNAPSHOT/version
+version1.4.0/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/dd109a87/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index e4efee7..38ff67d 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0-SNAPSHOT/version
+version1.4.0/version
 relativePath../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/dd109a87/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 1f3e619..e8784eb 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 artifactIdspark-parent_2.10/artifactId
-version1.4.0-SNAPSHOT/version
+version1.4.0/version
 relativePath../../pom.xml/relativePath
   /parent
 

http://git-wip-us.apache.org/repos/asf/spark/blob/dd109a87/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 8df7edb..1794f3e 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   parent
 groupIdorg.apache.spark/groupId
 

spark git commit: [SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior

2015-05-29 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 62df047a3 - 645e61164


[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test 
to ensure expected logging behavior

The `HiveThriftServer2Test` relies on proper logging behavior to assert whether 
the Thrift server daemon process is started successfully. However, some other 
jar files listed in the classpath may potentially contain an unexpected Log4J 
configuration file which overrides the logging behavior.

This PR writes a temporary `log4j.properties` and prepends it to the driver 
classpath before starting the testing Thrift server process, to ensure proper 
logging behavior.

cc andrewor14 yhuai

Author: Cheng Lian l...@databricks.com

Closes #6493 from liancheng/override-log4j and squashes the following commits:

c489e0e [Cheng Lian] Fixes minor Scala styling issue
b46ef0d [Cheng Lian] Uses a temporary log4j.properties in HiveThriftServer2Test 
to ensure expected logging behavior

(cherry picked from commit 4782e130400f16e77c8b7f7fe8791acae1c5f8f1)
Signed-off-by: Andrew Or and...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/645e6116
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/645e6116
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/645e6116

Branch: refs/heads/branch-1.4
Commit: 645e611644be3f62ef07e4ca7628bf298349d9a6
Parents: 62df047
Author: Cheng Lian l...@databricks.com
Authored: Fri May 29 11:11:40 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Fri May 29 11:11:47 2015 -0700

--
 .../thriftserver/HiveThriftServer2Suites.scala  | 31 
 1 file changed, 25 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/645e6116/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
--
diff --git 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
index 1fadea9..610939c 100644
--- 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
+++ 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
@@ -19,6 +19,8 @@ package org.apache.spark.sql.hive.thriftserver
 
 import java.io.File
 import java.net.URL
+import java.nio.charset.StandardCharsets
+import java.nio.file.{Files, Paths}
 import java.sql.{Date, DriverManager, Statement}
 
 import scala.collection.mutable.ArrayBuffer
@@ -54,7 +56,7 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
   override def mode: ServerMode.Value = ServerMode.binary
 
   private def withCLIServiceClient(f: ThriftCLIServiceClient = Unit): Unit = {
-// Transport creation logics below mimics 
HiveConnection.createBinaryTransport
+// Transport creation logic below mimics 
HiveConnection.createBinaryTransport
 val rawTransport = new TSocket(localhost, serverPort)
 val user = System.getProperty(user.name)
 val transport = PlainSaslHelper.getPlainTransport(user, anonymous, 
rawTransport)
@@ -391,10 +393,10 @@ abstract class HiveThriftJdbcTest extends 
HiveThriftServer2Test {
 val statements = connections.map(_.createStatement())
 
 try {
-  statements.zip(fs).map { case (s, f) = f(s) }
+  statements.zip(fs).foreach { case (s, f) = f(s) }
 } finally {
-  statements.map(_.close())
-  connections.map(_.close())
+  statements.foreach(_.close())
+  connections.foreach(_.close())
 }
   }
 
@@ -433,15 +435,32 @@ abstract class HiveThriftServer2Test extends FunSuite 
with BeforeAndAfterAll wit
   ConfVars.HIVE_SERVER2_THRIFT_HTTP_PORT
 }
 
+val driverClassPath = {
+  // Writes a temporary log4j.properties and prepend it to driver 
classpath, so that it
+  // overrides all other potential log4j configurations contained in other 
dependency jar files.
+  val tempLog4jConf = Utils.createTempDir().getCanonicalPath
+
+  Files.write(
+Paths.get(s$tempLog4jConf/log4j.properties),
+log4j.rootCategory=INFO, console
+  |log4j.appender.console=org.apache.log4j.ConsoleAppender
+  |log4j.appender.console.target=System.err
+  |log4j.appender.console.layout=org.apache.log4j.PatternLayout
+  |log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd 
HH:mm:ss} %p %c{1}: %m%n
+.stripMargin.getBytes(StandardCharsets.UTF_8))
+
+  tempLog4jConf + File.pathSeparator + sys.props(java.class.path)
+}
+
 s$startScript
|  --master local
-   |  --hiveconf 

spark git commit: [SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior

2015-05-29 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master e7b617755 - 4782e1304


[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test 
to ensure expected logging behavior

The `HiveThriftServer2Test` relies on proper logging behavior to assert whether 
the Thrift server daemon process is started successfully. However, some other 
jar files listed in the classpath may potentially contain an unexpected Log4J 
configuration file which overrides the logging behavior.

This PR writes a temporary `log4j.properties` and prepends it to the driver 
classpath before starting the testing Thrift server process, to ensure proper 
logging behavior.

cc andrewor14 yhuai

Author: Cheng Lian l...@databricks.com

Closes #6493 from liancheng/override-log4j and squashes the following commits:

c489e0e [Cheng Lian] Fixes minor Scala styling issue
b46ef0d [Cheng Lian] Uses a temporary log4j.properties in HiveThriftServer2Test 
to ensure expected logging behavior


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4782e130
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4782e130
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4782e130

Branch: refs/heads/master
Commit: 4782e130400f16e77c8b7f7fe8791acae1c5f8f1
Parents: e7b6177
Author: Cheng Lian l...@databricks.com
Authored: Fri May 29 11:11:40 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Fri May 29 11:11:40 2015 -0700

--
 .../thriftserver/HiveThriftServer2Suites.scala  | 31 
 1 file changed, 25 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4782e130/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
--
diff --git 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
index 1fadea9..610939c 100644
--- 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
+++ 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
@@ -19,6 +19,8 @@ package org.apache.spark.sql.hive.thriftserver
 
 import java.io.File
 import java.net.URL
+import java.nio.charset.StandardCharsets
+import java.nio.file.{Files, Paths}
 import java.sql.{Date, DriverManager, Statement}
 
 import scala.collection.mutable.ArrayBuffer
@@ -54,7 +56,7 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
   override def mode: ServerMode.Value = ServerMode.binary
 
   private def withCLIServiceClient(f: ThriftCLIServiceClient => Unit): Unit = {
-    // Transport creation logics below mimics HiveConnection.createBinaryTransport
+    // Transport creation logic below mimics HiveConnection.createBinaryTransport
     val rawTransport = new TSocket("localhost", serverPort)
     val user = System.getProperty("user.name")
     val transport = PlainSaslHelper.getPlainTransport(user, "anonymous", rawTransport)
@@ -391,10 +393,10 @@ abstract class HiveThriftJdbcTest extends HiveThriftServer2Test {
     val statements = connections.map(_.createStatement())
 
     try {
-      statements.zip(fs).map { case (s, f) => f(s) }
+      statements.zip(fs).foreach { case (s, f) => f(s) }
     } finally {
-      statements.map(_.close())
-      connections.map(_.close())
+      statements.foreach(_.close())
+      connections.foreach(_.close())
     }
   }
 
@@ -433,15 +435,32 @@ abstract class HiveThriftServer2Test extends FunSuite with BeforeAndAfterAll wit
       ConfVars.HIVE_SERVER2_THRIFT_HTTP_PORT
     }
 
+    val driverClassPath = {
+      // Writes a temporary log4j.properties and prepend it to driver classpath, so that it
+      // overrides all other potential log4j configurations contained in other dependency jar files.
+      val tempLog4jConf = Utils.createTempDir().getCanonicalPath
+
+      Files.write(
+        Paths.get(s"$tempLog4jConf/log4j.properties"),
+        """log4j.rootCategory=INFO, console
+          |log4j.appender.console=org.apache.log4j.ConsoleAppender
+          |log4j.appender.console.target=System.err
+          |log4j.appender.console.layout=org.apache.log4j.PatternLayout
+          |log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
+        """.stripMargin.getBytes(StandardCharsets.UTF_8))
+
+      tempLog4jConf + File.pathSeparator + sys.props("java.class.path")
+    }
+
     s"""$startScript
        |  --master local
-       |  --hiveconf hive.root.logger=INFO,console
        |  --hiveconf ${ConfVars.METASTORECONNECTURLKEY}=$metastoreJdbcUri
        |  --hiveconf 

[1/2] spark git commit: Preparing Spark release v1.4.0-rc3

2015-05-29 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 4be701aa5 -> c68abaa34


Preparing Spark release v1.4.0-rc3


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fb60503f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fb60503f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fb60503f

Branch: refs/heads/branch-1.4
Commit: fb60503ff231589e9c3f0b6db350399d3063beb9
Parents: 4be701a
Author: Patrick Wendell pwend...@gmail.com
Authored: Fri May 29 12:15:13 2015 -0700
Committer: Patrick Wendell pwend...@gmail.com
Committed: Fri May 29 12:15:13 2015 -0700

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 launcher/pom.xml  | 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 unsafe/pom.xml| 2 +-
 yarn/pom.xml  | 2 +-
 30 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/fb60503f/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 626c857..b8a821d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/fb60503f/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 1f3dec9..c1aa32b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/fb60503f/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index e58efe4..a9b8b42 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/fb60503f/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index e4efee7..38ff67d 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/fb60503f/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 1f3e619..e8784eb 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.4.0-SNAPSHOT</version>
+    <version>1.4.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/fb60503f/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 8df7edb..1794f3e 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
 

spark git commit: [SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide

2015-05-29 Thread davies
Repository: spark
Updated Branches:
  refs/heads/master 9eb222c13 -> 5f48e5c33


[SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide

This PR adds a new SparkR programming guide at the top-level. This will be 
useful for R users as our APIs don't directly match the Scala/Python APIs and 
as we need to explain SparkR without using RDDs as examples etc.

cc rxin davies pwendell

cc cafreeman -- Would be great if you could also take a look at this !

Author: Shivaram Venkataraman shiva...@cs.berkeley.edu

Closes #6490 from shivaram/sparkr-guide and squashes the following commits:

d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries
408dce5 [Shivaram Venkataraman] Fix link
dbb86e3 [Shivaram Venkataraman] Fix minor typo
9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in 
example
d09703c [Shivaram Venkataraman] Fix default argument in read.df
ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide Also update 
write.df, read.df to handle defaults better


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5f48e5c3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5f48e5c3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5f48e5c3

Branch: refs/heads/master
Commit: 5f48e5c33bafa376be5741e260a037c66103fdcd
Parents: 9eb222c
Author: Shivaram Venkataraman shiva...@cs.berkeley.edu
Authored: Fri May 29 14:11:58 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Fri May 29 14:11:58 2015 -0700

--
 R/pkg/R/DataFrame.R   |  10 +-
 R/pkg/R/SQLContext.R  |   5 +
 R/pkg/R/generics.R|   4 +-
 docs/_layouts/global.html |   1 +
 docs/index.md |   2 +-
 docs/sparkr.md| 223 +
 docs/sql-programming-guide.md |   4 +-
 7 files changed, 238 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/5f48e5c3/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index ed8093c..e79d324 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -1314,9 +1314,8 @@ setMethod("except",
 #' write.df(df, "myfile", "parquet", "overwrite")
 #' }
 setMethod("write.df",
-          signature(df = "DataFrame", path = 'character', source = 'character',
-                    mode = 'character'),
-          function(df, path = NULL, source = NULL, mode = "append", ...){
+          signature(df = "DataFrame", path = 'character'),
+          function(df, path, source = NULL, mode = "append", ...){
             if (is.null(source)) {
               sqlContext <- get(".sparkRSQLsc", envir = .sparkREnv)
               source <- callJMethod(sqlContext, "getConf", "spark.sql.sources.default",
@@ -1338,9 +1337,8 @@ setMethod("write.df",
 #' @aliases saveDF
 #' @export
 setMethod("saveDF",
-          signature(df = "DataFrame", path = 'character', source = 'character',
-                    mode = 'character'),
-          function(df, path = NULL, source = NULL, mode = "append", ...){
+          signature(df = "DataFrame", path = 'character'),
+          function(df, path, source = NULL, mode = "append", ...){
             write.df(df, path, source, mode, ...)
           })
 

http://git-wip-us.apache.org/repos/asf/spark/blob/5f48e5c3/R/pkg/R/SQLContext.R
--
diff --git a/R/pkg/R/SQLContext.R b/R/pkg/R/SQLContext.R
index 36cc612..88e1a50 100644
--- a/R/pkg/R/SQLContext.R
+++ b/R/pkg/R/SQLContext.R
@@ -457,6 +457,11 @@ read.df <- function(sqlContext, path = NULL, source = NULL, ...) {
   if (!is.null(path)) {
     options[['path']] <- path
   }
+  if (is.null(source)) {
+    sqlContext <- get(".sparkRSQLsc", envir = .sparkREnv)
+    source <- callJMethod(sqlContext, "getConf", "spark.sql.sources.default",
+                          "org.apache.spark.sql.parquet")
+  }
   sdf <- callJMethod(sqlContext, "load", source, options)
   dataFrame(sdf)
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/5f48e5c3/R/pkg/R/generics.R
--
diff --git a/R/pkg/R/generics.R b/R/pkg/R/generics.R
index a23d3b2..1f4fc6a 100644
--- a/R/pkg/R/generics.R
+++ b/R/pkg/R/generics.R
@@ -482,11 +482,11 @@ setGeneric("saveAsTable", function(df, tableName, source, mode, ...) {
 
 #' @rdname write.df
 #' @export
-setGeneric("write.df", function(df, path, source, mode, ...) { standardGeneric("write.df") })
+setGeneric("write.df", function(df, path, ...) { standardGeneric("write.df") })
 
 #' @rdname write.df
 #' @export
-setGeneric("saveDF", function(df, path, source, mode, ...) { standardGeneric("saveDF") })
+setGeneric("saveDF", function(df, path, ...) { standardGeneric("saveDF") })
 
 #' @rdname schema
 #' 

spark git commit: [SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide

2015-05-29 Thread davies
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 f40605f06 -> cf4122e4d


[SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide

This PR adds a new SparkR programming guide at the top-level. This will be 
useful for R users as our APIs don't directly match the Scala/Python APIs and 
as we need to explain SparkR without using RDDs as examples etc.

cc rxin davies pwendell

cc cafreeman -- Would be great if you could also take a look at this !

Author: Shivaram Venkataraman shiva...@cs.berkeley.edu

Closes #6490 from shivaram/sparkr-guide and squashes the following commits:

d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries
408dce5 [Shivaram Venkataraman] Fix link
dbb86e3 [Shivaram Venkataraman] Fix minor typo
9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in 
example
d09703c [Shivaram Venkataraman] Fix default argument in read.df
ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide Also update 
write.df, read.df to handle defaults better

(cherry picked from commit 5f48e5c33bafa376be5741e260a037c66103fdcd)
Signed-off-by: Davies Liu dav...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cf4122e4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cf4122e4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cf4122e4

Branch: refs/heads/branch-1.4
Commit: cf4122e4d4685ce1b93bd4ba6012ea98932d259c
Parents: f40605f
Author: Shivaram Venkataraman shiva...@cs.berkeley.edu
Authored: Fri May 29 14:11:58 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Fri May 29 14:12:18 2015 -0700

--
 R/pkg/R/DataFrame.R   |  10 +-
 R/pkg/R/SQLContext.R  |   5 +
 R/pkg/R/generics.R|   4 +-
 docs/_layouts/global.html |   1 +
 docs/index.md |   2 +-
 docs/sparkr.md| 223 +
 docs/sql-programming-guide.md |   4 +-
 7 files changed, 238 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/cf4122e4/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index ed8093c..e79d324 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -1314,9 +1314,8 @@ setMethod("except",
 #' write.df(df, "myfile", "parquet", "overwrite")
 #' }
 setMethod("write.df",
-          signature(df = "DataFrame", path = 'character', source = 'character',
-                    mode = 'character'),
-          function(df, path = NULL, source = NULL, mode = "append", ...){
+          signature(df = "DataFrame", path = 'character'),
+          function(df, path, source = NULL, mode = "append", ...){
             if (is.null(source)) {
               sqlContext <- get(".sparkRSQLsc", envir = .sparkREnv)
               source <- callJMethod(sqlContext, "getConf", "spark.sql.sources.default",
@@ -1338,9 +1337,8 @@ setMethod("write.df",
 #' @aliases saveDF
 #' @export
 setMethod("saveDF",
-          signature(df = "DataFrame", path = 'character', source = 'character',
-                    mode = 'character'),
-          function(df, path = NULL, source = NULL, mode = "append", ...){
+          signature(df = "DataFrame", path = 'character'),
+          function(df, path, source = NULL, mode = "append", ...){
             write.df(df, path, source, mode, ...)
           })
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cf4122e4/R/pkg/R/SQLContext.R
--
diff --git a/R/pkg/R/SQLContext.R b/R/pkg/R/SQLContext.R
index 36cc612..88e1a50 100644
--- a/R/pkg/R/SQLContext.R
+++ b/R/pkg/R/SQLContext.R
@@ -457,6 +457,11 @@ read.df <- function(sqlContext, path = NULL, source = NULL, ...) {
   if (!is.null(path)) {
     options[['path']] <- path
   }
+  if (is.null(source)) {
+    sqlContext <- get(".sparkRSQLsc", envir = .sparkREnv)
+    source <- callJMethod(sqlContext, "getConf", "spark.sql.sources.default",
+                          "org.apache.spark.sql.parquet")
+  }
   sdf <- callJMethod(sqlContext, "load", source, options)
   dataFrame(sdf)
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/cf4122e4/R/pkg/R/generics.R
--
diff --git a/R/pkg/R/generics.R b/R/pkg/R/generics.R
index a23d3b2..1f4fc6a 100644
--- a/R/pkg/R/generics.R
+++ b/R/pkg/R/generics.R
@@ -482,11 +482,11 @@ setGeneric("saveAsTable", function(df, tableName, source, mode, ...) {
 
 #' @rdname write.df
 #' @export
-setGeneric("write.df", function(df, path, source, mode, ...) { standardGeneric("write.df") })
+setGeneric("write.df", function(df, path, ...) { standardGeneric("write.df") })
 
 #' @rdname write.df
 #' @export
-setGeneric("saveDF", function(df, path, source, mode, ...) { 

[2/2] spark git commit: [SPARK-7899] [PYSPARK] Fix Python 3 pyspark/sql/types module conflict

2015-05-29 Thread davies
[SPARK-7899] [PYSPARK] Fix Python 3 pyspark/sql/types module conflict

This PR makes the types module in `pyspark/sql/types` work with pylint static 
analysis by removing the dynamic naming of the `pyspark/sql/_types` module to 
`pyspark/sql/types`.

Tests are now loaded using `$PYSPARK_DRIVER_PYTHON -m module` rather than 
`$PYSPARK_DRIVER_PYTHON module.py`. The old method adds the location of 
`module.py` to `sys.path`, so this change prevents accidental use of relative 
paths in Python.

Author: Michael Nazario mnaza...@palantir.com

Closes #6439 from mnazario/feature/SPARK-7899 and squashes the following 
commits:

366ef30 [Michael Nazario] Remove hack on random.py
bb8b04d [Michael Nazario] Make doctests consistent with other tests
6ee4f75 [Michael Nazario] Change test scripts to use -m
673528f [Michael Nazario] Move _types back to types


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1c5b1982
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1c5b1982
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1c5b1982

Branch: refs/heads/master
Commit: 1c5b19827a091b5aba69a967600e7ca35ed3bcfd
Parents: 5f48e5c
Author: Michael Nazario mnaza...@palantir.com
Authored: Fri May 29 14:13:44 2015 -0700
Committer: Davies Liu dav...@databricks.com
Committed: Fri May 29 14:13:44 2015 -0700

--
 bin/pyspark  |6 +-
 python/pyspark/accumulators.py   |4 +
 python/pyspark/mllib/__init__.py |8 -
 python/pyspark/mllib/rand.py |  409 ---
 python/pyspark/mllib/random.py   |  409 +++
 python/pyspark/sql/__init__.py   |   12 -
 python/pyspark/sql/_types.py | 1306 -
 python/pyspark/sql/types.py  | 1306 +
 python/run-tests |   76 +-
 9 files changed, 1758 insertions(+), 1778 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1c5b1982/bin/pyspark
--
diff --git a/bin/pyspark b/bin/pyspark
index 8acad61..7cb19c5 100755
--- a/bin/pyspark
+++ b/bin/pyspark
@@ -90,11 +90,7 @@ if [[ -n "$SPARK_TESTING" ]]; then
   unset YARN_CONF_DIR
   unset HADOOP_CONF_DIR
   export PYTHONHASHSEED=0
-  if [[ -n "$PYSPARK_DOC_TEST" ]]; then
-    exec "$PYSPARK_DRIVER_PYTHON" -m doctest $1
-  else
-    exec "$PYSPARK_DRIVER_PYTHON" $1
-  fi
+  exec "$PYSPARK_DRIVER_PYTHON" -m $1
   exit
 fi
 

http://git-wip-us.apache.org/repos/asf/spark/blob/1c5b1982/python/pyspark/accumulators.py
--
diff --git a/python/pyspark/accumulators.py b/python/pyspark/accumulators.py
index 0d21a13..adca90d 100644
--- a/python/pyspark/accumulators.py
+++ b/python/pyspark/accumulators.py
@@ -261,3 +261,7 @@ def _start_update_server():
 thread.daemon = True
 thread.start()
 return server
+
+if __name__ == __main__:
+import doctest
+doctest.testmod()

http://git-wip-us.apache.org/repos/asf/spark/blob/1c5b1982/python/pyspark/mllib/__init__.py
--
diff --git a/python/pyspark/mllib/__init__.py b/python/pyspark/mllib/__init__.py
index 07507b2..b11aed2 100644
--- a/python/pyspark/mllib/__init__.py
+++ b/python/pyspark/mllib/__init__.py
@@ -28,11 +28,3 @@ if numpy.version.version < '1.4':
 
 __all__ = ['classification', 'clustering', 'feature', 'fpm', 'linalg', 'random',
            'recommendation', 'regression', 'stat', 'tree', 'util']
-
-import sys
-from . import rand as random
-modname = __name__ + '.random'
-random.__name__ = modname
-random.RandomRDDs.__module__ = modname
-sys.modules[modname] = random
-del modname, sys

http://git-wip-us.apache.org/repos/asf/spark/blob/1c5b1982/python/pyspark/mllib/rand.py
--
diff --git a/python/pyspark/mllib/rand.py b/python/pyspark/mllib/rand.py
deleted file mode 100644
index 06fbc0e..000
--- a/python/pyspark/mllib/rand.py
+++ /dev/null
@@ -1,409 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and

[1/2] spark git commit: [SPARK-7899] [PYSPARK] Fix Python 3 pyspark/sql/types module conflict

2015-05-29 Thread davies
Repository: spark
Updated Branches:
  refs/heads/master 5f48e5c33 -> 1c5b19827


http://git-wip-us.apache.org/repos/asf/spark/blob/1c5b1982/python/pyspark/sql/types.py
--
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
new file mode 100644
index 000..9e7e9f0
--- /dev/null
+++ b/python/pyspark/sql/types.py
@@ -0,0 +1,1306 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import sys
+import decimal
+import time
+import datetime
+import keyword
+import warnings
+import json
+import re
+import weakref
+from array import array
+from operator import itemgetter
+
+if sys.version >= "3":
+    long = int
+    unicode = str
+
+from py4j.protocol import register_input_converter
+from py4j.java_gateway import JavaClass
+
+__all__ = [
+    "DataType", "NullType", "StringType", "BinaryType", "BooleanType", "DateType",
+    "TimestampType", "DecimalType", "DoubleType", "FloatType", "ByteType", "IntegerType",
+    "LongType", "ShortType", "ArrayType", "MapType", "StructField", "StructType"]
+
+
+class DataType(object):
+    """Base class for data types."""
+
+    def __repr__(self):
+        return self.__class__.__name__
+
+    def __hash__(self):
+        return hash(str(self))
+
+    def __eq__(self, other):
+        return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
+
+    def __ne__(self, other):
+        return not self.__eq__(other)
+
+    @classmethod
+    def typeName(cls):
+        return cls.__name__[:-4].lower()
+
+    def simpleString(self):
+        return self.typeName()
+
+    def jsonValue(self):
+        return self.typeName()
+
+    def json(self):
+        return json.dumps(self.jsonValue(),
+                          separators=(',', ':'),
+                          sort_keys=True)
+
+
+# This singleton pattern does not work with pickle, you will get
+# another object after pickle and unpickle
+class DataTypeSingleton(type):
+    """Metaclass for DataType"""
+
+    _instances = {}
+
+    def __call__(cls):
+        if cls not in cls._instances:
+            cls._instances[cls] = super(DataTypeSingleton, cls).__call__()
+        return cls._instances[cls]
+
+
+class NullType(DataType):
+    """Null type.
+
+    The data type representing None, used for the types that cannot be inferred.
+    """
+
+    __metaclass__ = DataTypeSingleton
+
+
+class AtomicType(DataType):
+    """An internal type used to represent everything that is not
+    null, UDTs, arrays, structs, and maps."""
+
+    __metaclass__ = DataTypeSingleton
+
+
+class NumericType(AtomicType):
+    """Numeric data types.
+    """
+
+
+class IntegralType(NumericType):
+    """Integral data types.
+    """
+
+
+class FractionalType(NumericType):
+    """Fractional data types.
+    """
+
+
+class StringType(AtomicType):
+    """String data type.
+    """
+
+
+class BinaryType(AtomicType):
+    """Binary (byte array) data type.
+    """
+
+
+class BooleanType(AtomicType):
+    """Boolean data type.
+    """
+
+
+class DateType(AtomicType):
+    """Date (datetime.date) data type.
+    """
+
+
+class TimestampType(AtomicType):
+    """Timestamp (datetime.datetime) data type.
+    """
+
+
+class DecimalType(FractionalType):
+    """Decimal (decimal.Decimal) data type.
+    """
+
+    def __init__(self, precision=None, scale=None):
+        self.precision = precision
+        self.scale = scale
+        self.hasPrecisionInfo = precision is not None
+
+    def simpleString(self):
+        if self.hasPrecisionInfo:
+            return "decimal(%d,%d)" % (self.precision, self.scale)
+        else:
+            return "decimal(10,0)"
+
+    def jsonValue(self):
+        if self.hasPrecisionInfo:
+            return "decimal(%d,%d)" % (self.precision, self.scale)
+        else:
+            return "decimal"
+
+    def __repr__(self):
+        if self.hasPrecisionInfo:
+            return "DecimalType(%d,%d)" % (self.precision, self.scale)
+        else:
+            return "DecimalType()"
+
+
+class DoubleType(FractionalType):
+    """Double data type, representing double precision floats.
+    """
+
+
+class FloatType(FractionalType):
+    """Float data type, representing single precision floats.
+    """
+
+
+class ByteType(IntegralType):
+    """Byte data type, i.e. a signed integer in a single byte.
+    """
+    def