spark git commit: [SPARK-7886] Added unit test for HAVING aggregate pushdown.
Repository: spark Updated Branches: refs/heads/master 57c60c5be -> e90035e67

[SPARK-7886] Added unit test for HAVING aggregate pushdown. This is a followup to #6712.

Author: Reynold Xin r...@databricks.com

Closes #6739 from rxin/6712-followup and squashes the following commits:

fd9acfb [Reynold Xin] [SPARK-7886] Added unit test for HAVING aggregate pushdown.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e90035e6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e90035e6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e90035e6

Branch: refs/heads/master
Commit: e90035e676e492de840f44b61b330db526313019
Parents: 57c60c5
Author: Reynold Xin r...@databricks.com
Authored: Wed Jun 10 18:58:01 2015 +0800
Committer: Cheng Lian l...@databricks.com
Committed: Wed Jun 10 18:58:01 2015 +0800

--
 .../src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala | 7 +++
 .../src/main/scala/org/apache/spark/sql/hive/HiveQl.scala   | 1 -
 2 files changed, 7 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/e90035e6/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
--
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
index 5babc43..3ca5ff3 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@@ -38,6 +38,13 @@ class SQLQuerySuite extends QueryTest with BeforeAndAfterAll with SQLTestUtils {
   import sqlContext.implicits._
   import sqlContext.sql

+  test("having clause") {
+    Seq(("one", 1), ("two", 2), ("three", 3), ("one", 5)).toDF("k", "v").registerTempTable("hav")
+    checkAnswer(
+      sql("SELECT k, sum(v) FROM hav GROUP BY k HAVING sum(v) > 2"),
+      Row("one", 6) :: Row("three", 3) :: Nil)
+  }
+
   test("SPARK-6743: no columns from cache") {
     Seq(
       (83, 0, 38),

http://git-wip-us.apache.org/repos/asf/spark/blob/e90035e6/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
--
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
index 041483e..ca4b80b 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
@@ -1307,7 +1307,6 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C
     HiveParser.DecimalLiteral)

   /* Case insensitive matches */
-  val COALESCE = "(?i)COALESCE".r
   val COUNT = "(?i)COUNT".r
   val SUM = "(?i)SUM".r
   val AND = "(?i)AND".r
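[Editor's note] For readers comparing against the DataFrame API, the same aggregate-then-filter shape exercised by the new test can be written directly. A minimal, illustrative sketch (not part of the commit; it assumes a spark-shell style session with sqlContext.implicits._ imported, as in the suite above):

    import org.apache.spark.sql.functions.{col, sum}

    val df = Seq(("one", 1), ("two", 2), ("three", 3), ("one", 5)).toDF("k", "v")
    // Group, aggregate, then filter on the aggregate -- the plan shape that
    // the HAVING clause in the new test is lowered to.
    df.groupBy("k").agg(sum("v").as("sum_v")).where(col("sum_v") > 2).show()
    // expected rows: ("one", 6) and ("three", 3)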
spark git commit: [SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext.
Repository: spark Updated Branches: refs/heads/master 778f3ca81 -> 57c60c5be

[SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext. This builds on #6710 and also uses FunctionRegistry for function lookup in HiveContext.

Author: Reynold Xin r...@databricks.com

Closes #6712 from rxin/udf-registry-hive and squashes the following commits:

f4c2df0 [Reynold Xin] Fixed style violation.
0bd4127 [Reynold Xin] Fixed Python UDFs.
f9a0378 [Reynold Xin] Disable one more test.
5609494 [Reynold Xin] Disable some failing tests.
4efea20 [Reynold Xin] Don't check children resolved for UDF resolution.
2ebe549 [Reynold Xin] Removed more hardcoded functions.
aadce78 [Reynold Xin] [SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/57c60c5b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/57c60c5b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/57c60c5b

Branch: refs/heads/master
Commit: 57c60c5be7aa731ca1a6966f4285eb02f481eb71
Parents: 778f3ca
Author: Reynold Xin r...@databricks.com
Authored: Wed Jun 10 00:36:16 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 00:36:16 2015 -0700

--
 .../apache/spark/sql/catalyst/SqlParser.scala   |  2 +-
 .../spark/sql/catalyst/analysis/Analyzer.scala  |  9 +--
 .../catalyst/analysis/FunctionRegistry.scala    | 12 +++-
 .../sql/catalyst/expressions/Expression.scala   |  8 +--
 .../scala/org/apache/spark/sql/SQLContext.scala |  7 +--
 .../apache/spark/sql/execution/pythonUdfs.scala | 66 +++-
 .../hive/execution/HiveCompatibilitySuite.scala | 10 +--
 .../org/apache/spark/sql/hive/HiveContext.scala |  2 +-
 .../org/apache/spark/sql/hive/HiveQl.scala      | 30 -
 .../org/apache/spark/sql/hive/hiveUdfs.scala    | 51 ---
 10 files changed, 92 insertions(+), 105 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/57c60c5b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala
index f74c17d..da3a717 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala
@@ -68,7 +68,6 @@ class SqlParser extends AbstractSparkSQLParser with DataTypeParser {
   protected val FULL = Keyword("FULL")
   protected val GROUP = Keyword("GROUP")
   protected val HAVING = Keyword("HAVING")
-  protected val IF = Keyword("IF")
   protected val IN = Keyword("IN")
   protected val INNER = Keyword("INNER")
   protected val INSERT = Keyword("INSERT")
@@ -277,6 +276,7 @@ class SqlParser extends AbstractSparkSQLParser with DataTypeParser {
       lexical.normalizeKeyword(udfName) match {
         case "sum" => SumDistinct(exprs.head)
         case "count" => CountDistinct(exprs)
+        case _ => throw new AnalysisException(s"function $udfName does not support DISTINCT")
       }
     }
   | APPROXIMATE ~> ident ~ ("(" ~ DISTINCT ~> expression <~ ")") ^^ { case udfName ~ exp =>

http://git-wip-us.apache.org/repos/asf/spark/blob/57c60c5b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 02b10c4..c4f12cf 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -460,7 +460,7 @@ class Analyzer(
     def apply(plan: LogicalPlan): LogicalPlan = plan transform {
       case q: LogicalPlan =>
         q transformExpressions {
-          case u @ UnresolvedFunction(name, children) if u.childrenResolved =>
+          case u @ UnresolvedFunction(name, children) =>
             withPosition(u) {
               registry.lookupFunction(name, children)
             }
@@ -494,20 +494,21 @@ class Analyzer(
   object UnresolvedHavingClauseAttributes extends Rule[LogicalPlan] {
     def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
       case filter @ Filter(havingCondition, aggregate @ Aggregate(_, originalAggExprs, _))
-          if aggregate.resolved && containsAggregate(havingCondition) => {
+          if aggregate.resolved && containsAggregate(havingCondition) =>
+
         val evaluatedCondition = Alias(havingCondition, "havingCondition")()
         val aggExprsWithHaving = evaluatedCondition +:
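[Editor's note] To make the registry-based lookup this commit moves to more concrete, here is a minimal, self-contained sketch of the pattern (the types and names below are simplified stand-ins, not Spark's actual FunctionRegistry API):

    // Hypothetical, simplified sketch of expression lookup through a function
    // registry, as opposed to hard-coded matches in the parser or HiveQl.
    trait Expression
    case class Sum(child: Expression) extends Expression

    object SimpleRegistry {
      private val builders =
        scala.collection.mutable.Map.empty[String, Seq[Expression] => Expression]

      def register(name: String)(builder: Seq[Expression] => Expression): Unit =
        builders(name.toLowerCase) = builder

      def lookupFunction(name: String, children: Seq[Expression]): Expression =
        builders.getOrElse(name.toLowerCase,
          sys.error(s"undefined function $name"))(children)
    }

    // Registration resembles FunctionRegistry's expression[Sum]("sum") entries:
    SimpleRegistry.register("sum")(children => Sum(children.head))

The design choice is the point of the commit: new built-in functions are added in one registry table rather than in parser, analyzer, and HiveQl case statements separately.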
spark git commit: [SPARK-8215] [SPARK-8212] [SQL] add leaf math expression for e and pi
Repository: spark Updated Branches: refs/heads/master e90035e67 -> c6ba7cca3

[SPARK-8215] [SPARK-8212] [SQL] add leaf math expression for e and pi

Author: Daoyuan Wang daoyuan.w...@intel.com

Closes #6716 from adrian-wang/epi and squashes the following commits:

e2e8dbd [Daoyuan Wang] move tests
11b351c [Daoyuan Wang] add tests and remove pu
db331c9 [Daoyuan Wang] py style
599ddd8 [Daoyuan Wang] add py
e6783ef [Daoyuan Wang] register function
82d426e [Daoyuan Wang] add function entry
dbf3ab5 [Daoyuan Wang] add PI and E

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c6ba7cca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c6ba7cca
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c6ba7cca

Branch: refs/heads/master
Commit: c6ba7cca3338e3f4f719d86dbcff4406d949edc7
Parents: e90035e
Author: Daoyuan Wang daoyuan.w...@intel.com
Authored: Wed Jun 10 09:45:45 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 09:45:45 2015 -0700

--
 .../catalyst/analysis/FunctionRegistry.scala    |  2 ++
 .../spark/sql/catalyst/expressions/math.scala   | 35 
 .../expressions/MathFunctionsSuite.scala        | 22 
 .../scala/org/apache/spark/sql/functions.scala  | 18 ++
 .../spark/sql/DataFrameFunctionsSuite.scala     | 19 +++
 5 files changed, 96 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/c6ba7cca/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index 936ffc7..ba89a5c 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -106,6 +106,7 @@ object FunctionRegistry {
     expression[Cbrt]("cbrt"),
     expression[Ceil]("ceil"),
     expression[Cos]("cos"),
+    expression[EulerNumber]("e"),
     expression[Exp]("exp"),
     expression[Expm1]("expm1"),
     expression[Floor]("floor"),
@@ -113,6 +114,7 @@
     expression[Log]("log"),
     expression[Log10]("log10"),
     expression[Log1p]("log1p"),
+    expression[Pi]("pi"),
     expression[Pow]("pow"),
     expression[Rint]("rint"),
     expression[Signum]("signum"),

http://git-wip-us.apache.org/repos/asf/spark/blob/c6ba7cca/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
index 7dacb6a..e1d8c9a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
@@ -21,8 +21,33 @@ import org.apache.spark.sql.catalyst.expressions.codegen._
 import org.apache.spark.sql.types.{DataType, DoubleType}

 /**
+ * A leaf expression specifically for math constants. Math constants expect no input.
+ * @param c The math constant.
+ * @param name The short name of the function
+ */
+abstract class LeafMathExpression(c: Double, name: String)
+  extends LeafExpression with Serializable {
+  self: Product =>
+
+  override def dataType: DataType = DoubleType
+  override def foldable: Boolean = true
+  override def nullable: Boolean = false
+  override def toString: String = s"$name()"
+
+  override def eval(input: Row): Any = c
+
+  override def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
+    s"""
+      boolean ${ev.isNull} = false;
+      ${ctx.javaType(dataType)} ${ev.primitive} = java.lang.Math.$name;
+    """
+  }
+}
+
+/**
  * A unary expression specifically for math functions. Math Functions expect a specific type of
  * input format, therefore these functions extend `ExpectsInputTypes`.
+ * @param f The math function.
  * @param name The short name of the function
  */
 abstract class UnaryMathExpression(f: Double => Double, name: String)
@@ -100,6 +125,16 @@ abstract class BinaryMathExpression(f: (Double, Double) => Double, name: String)

+// Leaf math functions
+
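[Editor's note] Once registered, the new constants are callable like any other built-in. An illustrative spark-shell usage (not from the commit; it assumes a sqlContext value as in the SQL test suites):

    sqlContext.sql("SELECT pi(), e()").show()
    // Both columns are DoubleType. Because foldable is true, the optimizer can
    // constant-fold these leaf expressions at planning time rather than
    // evaluating java.lang.Math.PI / java.lang.Math.E per row.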
spark git commit: [SPARK-8282] [SPARKR] Make number of threads used in RBackend configurable
Repository: spark Updated Branches: refs/heads/branch-1.4 7b88e6a1e -> 28e8a6ea6

[SPARK-8282] [SPARKR] Make number of threads used in RBackend configurable

Read number of threads for RBackend from configuration.

[SPARK-8282] #comment Linking with JIRA

Author: Hossein hoss...@databricks.com

Closes #6730 from falaki/SPARK-8282 and squashes the following commits:

33b3d98 [Hossein] Documented new config parameter
70f2a9c [Hossein] Fixing import
ec44225 [Hossein] Read number of threads for RBackend from configuration

(cherry picked from commit 30ebf1a233295539c2455bd838bae7315711e1e2)
Signed-off-by: Andrew Or and...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/28e8a6ea
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/28e8a6ea
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/28e8a6ea

Branch: refs/heads/branch-1.4
Commit: 28e8a6ea65fd08ab9cefc4d179d5c66ffefd3eb4
Parents: 7b88e6a
Author: Hossein hoss...@databricks.com
Authored: Wed Jun 10 13:18:48 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:19:53 2015 -0700

--
 .../main/scala/org/apache/spark/api/r/RBackend.scala |  5 +++--
 docs/configuration.md                                | 12 
 2 files changed, 15 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/28e8a6ea/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/r/RBackend.scala b/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
index d24c650..1a5f2bc 100644
--- a/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
@@ -29,7 +29,7 @@ import io.netty.channel.socket.nio.NioServerSocketChannel
 import io.netty.handler.codec.LengthFieldBasedFrameDecoder
 import io.netty.handler.codec.bytes.{ByteArrayDecoder, ByteArrayEncoder}

-import org.apache.spark.Logging
+import org.apache.spark.{Logging, SparkConf}

 /**
  * Netty-based backend server that is used to communicate between R and Java.
@@ -41,7 +41,8 @@ private[spark] class RBackend {
   private[this] var bossGroup: EventLoopGroup = null

   def init(): Int = {
-    bossGroup = new NioEventLoopGroup(2)
+    val conf = new SparkConf()
+    bossGroup = new NioEventLoopGroup(conf.getInt("spark.r.numRBackendThreads", 2))
     val workerGroup = bossGroup
     val handler = new RBackendHandler(this)

http://git-wip-us.apache.org/repos/asf/spark/blob/28e8a6ea/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index 3960e7e..95a322f 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1495,6 +1495,18 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 </table>

+#### SparkR
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+<tr>
+  <td><code>spark.r.numRBackendThreads</code></td>
+  <td>2</td>
+  <td>
+    Number of threads used by RBackend to handle RPC calls from SparkR package.
+  </td>
+</tr>
+</table>
+
 #### Cluster Managers

 Each cluster manager in Spark has additional configuration options. Configurations can be found on the pages for each mode:
spark git commit: [SPARK-8282] [SPARKR] Make number of threads used in RBackend configurable
Repository: spark Updated Branches: refs/heads/master 38112905b -> 30ebf1a23

[SPARK-8282] [SPARKR] Make number of threads used in RBackend configurable

Read number of threads for RBackend from configuration.

[SPARK-8282] #comment Linking with JIRA

Author: Hossein hoss...@databricks.com

Closes #6730 from falaki/SPARK-8282 and squashes the following commits:

33b3d98 [Hossein] Documented new config parameter
70f2a9c [Hossein] Fixing import
ec44225 [Hossein] Read number of threads for RBackend from configuration

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/30ebf1a2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/30ebf1a2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/30ebf1a2

Branch: refs/heads/master
Commit: 30ebf1a233295539c2455bd838bae7315711e1e2
Parents: 3811290
Author: Hossein hoss...@databricks.com
Authored: Wed Jun 10 13:18:48 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:19:44 2015 -0700

--
 .../main/scala/org/apache/spark/api/r/RBackend.scala |  5 +++--
 docs/configuration.md                                | 12 
 2 files changed, 15 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/30ebf1a2/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/r/RBackend.scala b/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
index d24c650..1a5f2bc 100644
--- a/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/RBackend.scala
@@ -29,7 +29,7 @@ import io.netty.channel.socket.nio.NioServerSocketChannel
 import io.netty.handler.codec.LengthFieldBasedFrameDecoder
 import io.netty.handler.codec.bytes.{ByteArrayDecoder, ByteArrayEncoder}

-import org.apache.spark.Logging
+import org.apache.spark.{Logging, SparkConf}

 /**
  * Netty-based backend server that is used to communicate between R and Java.
@@ -41,7 +41,8 @@ private[spark] class RBackend {
   private[this] var bossGroup: EventLoopGroup = null

   def init(): Int = {
-    bossGroup = new NioEventLoopGroup(2)
+    val conf = new SparkConf()
+    bossGroup = new NioEventLoopGroup(conf.getInt("spark.r.numRBackendThreads", 2))
     val workerGroup = bossGroup
     val handler = new RBackendHandler(this)

http://git-wip-us.apache.org/repos/asf/spark/blob/30ebf1a2/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index 3960e7e..95a322f 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1495,6 +1495,18 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 </table>

+#### SparkR
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+<tr>
+  <td><code>spark.r.numRBackendThreads</code></td>
+  <td>2</td>
+  <td>
+    Number of threads used by RBackend to handle RPC calls from SparkR package.
+  </td>
+</tr>
+</table>
+
 #### Cluster Managers

 Each cluster manager in Spark has additional configuration options. Configurations can be found on the pages for each mode:
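[Editor's note] Since the thread count is now read through SparkConf, it can be set like any other Spark property. An illustrative sketch (not from the commit):

    import org.apache.spark.SparkConf

    // Raise the RBackend pool from the default of 2 to 8 threads.
    val conf = new SparkConf().set("spark.r.numRBackendThreads", "8")
    // Equivalently at submit time: --conf spark.r.numRBackendThreads=8,
    // or a line in conf/spark-defaults.conf.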
spark git commit: [SPARK-7756] CORE RDDOperationScope fix for IBM Java
Repository: spark Updated Branches: refs/heads/master 30ebf1a23 -> 19e30b48f

[SPARK-7756] CORE RDDOperationScope fix for IBM Java

IBM Java has an extra method when we do getStackTrace(): this is getStackTraceImpl, a native method. This causes two tests to fail within DStreamScopeSuite when running with IBM Java. Instead of map or filter being the method names found, getStackTrace is returned.

This commit addresses such an issue by using dropWhile. Given that our current method is withScope, we look for the next method that isn't ours: we don't care about methods that come before us in the stack trace: e.g. getStackTrace (regardless of how many levels this might go).

IBM:
java.lang.Thread.getStackTraceImpl(Native Method)
java.lang.Thread.getStackTrace(Thread.java:1117)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:104)

Oracle:
PRINTING STACKTRACE!!!
java.lang.Thread.getStackTrace(Thread.java:1552)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:106)

I've tested this with Oracle and IBM Java, no side effects for other tests introduced.

Author: Adam Roberts arobe...@uk.ibm.com
Author: a-roberts arobe...@uk.ibm.com

Closes #6740 from a-roberts/RDDScopeStackCrawlFix and squashes the following commits:

13ce390 [Adam Roberts] Ensure consistency with String equality checking
a4fc0e0 [a-roberts] Update RDDOperationScope.scala

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/19e30b48
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/19e30b48
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/19e30b48

Branch: refs/heads/master
Commit: 19e30b48f3c6d0b72871d3e15b9564c1b2822700
Parents: 30ebf1a
Author: Adam Roberts arobe...@uk.ibm.com
Authored: Wed Jun 10 13:21:01 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:21:51 2015 -0700

--
 .../main/scala/org/apache/spark/rdd/RDDOperationScope.scala | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/19e30b48/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala b/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
index 6b09dfa..4466728 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
@@ -95,10 +95,9 @@ private[spark] object RDDOperationScope extends Logging {
   private[spark] def withScope[T](
       sc: SparkContext,
       allowNesting: Boolean = false)(body: => T): T = {
-    val stackTrace = Thread.currentThread.getStackTrace().tail // ignore Thread#getStackTrace
-    val ourMethodName = stackTrace(1).getMethodName // i.e. withScope
-    // Climb upwards to find the first method that's called something different
-    val callerMethodName = stackTrace
+    val ourMethodName = "withScope"
+    val callerMethodName = Thread.currentThread.getStackTrace()
+      .dropWhile(_.getMethodName != ourMethodName)
       .find(_.getMethodName != ourMethodName)
       .map(_.getMethodName)
       .getOrElse {
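[Editor's note] The fix is easiest to see in isolation. A self-contained sketch of the new lookup, suitable for pasting into a REPL (illustrative, with a placeholder fallback instead of the real error handling):

    // Skip every frame above our own method, then take the first frame that is
    // NOT ours. This is robust to extra frames such as IBM Java's native
    // getStackTraceImpl appearing before withScope.
    val ourMethodName = "withScope"
    val callerMethodName = Thread.currentThread.getStackTrace()
      .dropWhile(_.getMethodName != ourMethodName) // drop getStackTrace[Impl], etc.
      .find(_.getMethodName != ourMethodName)      // first frame past withScope
      .map(_.getMethodName)
      .getOrElse("<unknown>")                      // placeholder fallback

The old code assumed withScope sat at a fixed index in the stack trace; dropWhile/find makes the position irrelevant.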
spark git commit: [SPARK-7756] CORE RDDOperationScope fix for IBM Java
Repository: spark Updated Branches: refs/heads/branch-1.4 28e8a6ea6 -> 568d1d51d

[SPARK-7756] CORE RDDOperationScope fix for IBM Java

IBM Java has an extra method when we do getStackTrace(): this is getStackTraceImpl, a native method. This causes two tests to fail within DStreamScopeSuite when running with IBM Java. Instead of map or filter being the method names found, getStackTrace is returned.

This commit addresses such an issue by using dropWhile. Given that our current method is withScope, we look for the next method that isn't ours: we don't care about methods that come before us in the stack trace: e.g. getStackTrace (regardless of how many levels this might go).

IBM:
java.lang.Thread.getStackTraceImpl(Native Method)
java.lang.Thread.getStackTrace(Thread.java:1117)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:104)

Oracle:
PRINTING STACKTRACE!!!
java.lang.Thread.getStackTrace(Thread.java:1552)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:106)

I've tested this with Oracle and IBM Java, no side effects for other tests introduced.

Author: Adam Roberts arobe...@uk.ibm.com
Author: a-roberts arobe...@uk.ibm.com

Closes #6740 from a-roberts/RDDScopeStackCrawlFix and squashes the following commits:

13ce390 [Adam Roberts] Ensure consistency with String equality checking
a4fc0e0 [a-roberts] Update RDDOperationScope.scala

(cherry picked from commit 19e30b48f3c6d0b72871d3e15b9564c1b2822700)
Signed-off-by: Andrew Or and...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/568d1d51
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/568d1d51
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/568d1d51

Branch: refs/heads/branch-1.4
Commit: 568d1d51d695bea4389f4470cd98707f3049885a
Parents: 28e8a6e
Author: Adam Roberts arobe...@uk.ibm.com
Authored: Wed Jun 10 13:21:01 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:21:59 2015 -0700

--
 .../main/scala/org/apache/spark/rdd/RDDOperationScope.scala | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/568d1d51/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala b/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
index 6b09dfa..4466728 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala
@@ -95,10 +95,9 @@ private[spark] object RDDOperationScope extends Logging {
   private[spark] def withScope[T](
       sc: SparkContext,
       allowNesting: Boolean = false)(body: => T): T = {
-    val stackTrace = Thread.currentThread.getStackTrace().tail // ignore Thread#getStackTrace
-    val ourMethodName = stackTrace(1).getMethodName // i.e. withScope
-    // Climb upwards to find the first method that's called something different
-    val callerMethodName = stackTrace
+    val ourMethodName = "withScope"
+    val callerMethodName = Thread.currentThread.getStackTrace()
+      .dropWhile(_.getMethodName != ourMethodName)
       .find(_.getMethodName != ourMethodName)
       .map(_.getMethodName)
       .getOrElse {
spark git commit: [SPARK-7261] [CORE] Change default log level to WARN in the REPL
Repository: spark Updated Branches: refs/heads/master e90c9d92d -> 80043e9e7

[SPARK-7261] [CORE] Change default log level to WARN in the REPL

1. Add `log4j-defaults-repl.properties` that has log level WARN.
2. When logging is initialized, check whether inside the REPL. If so, use `log4j-defaults-repl.properties`.
3. Print the following information if using `log4j-defaults-repl.properties`:
```
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
```

Author: zsxwing zsxw...@gmail.com

Closes #6734 from zsxwing/log4j-repl and squashes the following commits:

3835eff [zsxwing] Change default log level to WARN in the REPL

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80043e9e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/80043e9e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/80043e9e

Branch: refs/heads/master
Commit: 80043e9e761c44ce2c3a432dcd1989be573f8bb4
Parents: e90c9d9
Author: zsxwing zsxw...@gmail.com
Authored: Wed Jun 10 13:25:59 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:26:33 2015 -0700

--
 .rat-excludes                                    |  1 +
 .../apache/spark/log4j-defaults-repl.properties  | 12 +
 .../main/scala/org/apache/spark/Logging.scala    | 26 ++--
 3 files changed, 32 insertions(+), 7 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/80043e9e/.rat-excludes
--
diff --git a/.rat-excludes b/.rat-excludes
index 994c7e8..aa008e6 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -28,6 +28,7 @@ spark-env.sh
 spark-env.cmd
 spark-env.sh.template
 log4j-defaults.properties
+log4j-defaults-repl.properties
 bootstrap-tooltip.js
 jquery-1.11.1.min.js
 d3.min.js

http://git-wip-us.apache.org/repos/asf/spark/blob/80043e9e/core/src/main/resources/org/apache/spark/log4j-defaults-repl.properties
--
diff --git a/core/src/main/resources/org/apache/spark/log4j-defaults-repl.properties b/core/src/main/resources/org/apache/spark/log4j-defaults-repl.properties
new file mode 100644
index 000..b146f8a
--- /dev/null
+++ b/core/src/main/resources/org/apache/spark/log4j-defaults-repl.properties
@@ -0,0 +1,12 @@
+# Set everything to be logged to the console
+log4j.rootCategory=WARN, console
+log4j.appender.console=org.apache.log4j.ConsoleAppender
+log4j.appender.console.target=System.err
+log4j.appender.console.layout=org.apache.log4j.PatternLayout
+log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
+
+# Settings to quiet third party logs that are too verbose
+log4j.logger.org.spark-project.jetty=WARN
+log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
+log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
+log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

http://git-wip-us.apache.org/repos/asf/spark/blob/80043e9e/core/src/main/scala/org/apache/spark/Logging.scala
--
diff --git a/core/src/main/scala/org/apache/spark/Logging.scala b/core/src/main/scala/org/apache/spark/Logging.scala
index 419d093..7fcb783 100644
--- a/core/src/main/scala/org/apache/spark/Logging.scala
+++ b/core/src/main/scala/org/apache/spark/Logging.scala
@@ -121,13 +121,25 @@ trait Logging {
     if (usingLog4j12) {
       val log4j12Initialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
       if (!log4j12Initialized) {
-        val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
-        Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
-          case Some(url) =>
-            PropertyConfigurator.configure(url)
-            System.err.println(s"Using Spark's default log4j profile: $defaultLogProps")
-          case None =>
-            System.err.println(s"Spark was unable to load $defaultLogProps")
+        if (Utils.isInInterpreter) {
+          val replDefaultLogProps = "org/apache/spark/log4j-defaults-repl.properties"
+          Option(Utils.getSparkClassLoader.getResource(replDefaultLogProps)) match {
+            case Some(url) =>
+              PropertyConfigurator.configure(url)
+              System.err.println(s"Using Spark's repl log4j profile: $replDefaultLogProps")
+              System.err.println("To adjust logging level use sc.setLogLevel(\"INFO\")")
+            case None =>
+              System.err.println(s"Spark was unable to load $replDefaultLogProps")
+          }
+        } else {
+          val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
+
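[Editor's note] The practical effect for users: INFO logs disappear by default in the shell, and the printed hint shows how to get them back. An illustrative spark-shell session (sc is the shell's SparkContext):

    sc.setLogLevel("INFO")            // restore verbose logging for this session
    sc.parallelize(1 to 10).count()   // stage/task INFO messages now appear again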
spark git commit: [SPARK-8273] Driver hangs up when yarn shutdown in client mode
Repository: spark Updated Branches: refs/heads/master cb871c44c -> 5014d0ed7

[SPARK-8273] Driver hangs up when yarn shutdown in client mode

In client mode, if YARN was shut down with a Spark application running, the application will hang after several retries (default: 30) because the exception thrown by YarnClientImpl could not be caught at the upper level; we should exit instead, so that the user is made aware of the failure. The exception we want to catch is [here](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java#L122), and the fix follows the approach in [MR](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java#L320).

Author: WangTaoTheTonic wangtao...@huawei.com

Closes #6717 from WangTaoTheTonic/SPARK-8273 and squashes the following commits:

28752d6 [WangTaoTheTonic] catch the thrown exception

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5014d0ed
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5014d0ed
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5014d0ed

Branch: refs/heads/master
Commit: 5014d0ed7e2f69810654003f8dd38078b945cf05
Parents: cb871c4
Author: WangTaoTheTonic wangtao...@huawei.com
Authored: Wed Jun 10 13:34:19 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:35:51 2015 -0700

--
 yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala | 4 
 1 file changed, 4 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/5014d0ed/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
--
diff --git a/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index ec9402a..da1ec2a 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -29,6 +29,7 @@ import scala.collection.JavaConversions._
 import scala.collection.mutable.{ArrayBuffer, HashMap, HashSet, ListBuffer, Map}
 import scala.reflect.runtime.universe
 import scala.util.{Try, Success, Failure}
+import scala.util.control.NonFatal

 import com.google.common.base.Charsets.UTF_8
 import com.google.common.base.Objects
@@ -826,6 +827,9 @@ private[spark] class Client(
       case e: ApplicationNotFoundException =>
         logError(s"Application $appId not found.")
         return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
+      case NonFatal(e) =>
+        logError(s"Failed to contact YARN for application $appId.", e)
+        return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
     }
     val state = report.getYarnApplicationState
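[Editor's note] For readers unfamiliar with the construct, a minimal standalone sketch of the NonFatal pattern the fix relies on (names are illustrative): it matches any exception except fatal ones such as OutOfMemoryError or InterruptedException, so YARN connection failures are caught while truly fatal errors still propagate.

    import scala.util.control.NonFatal

    def reportStatus(poll: () => String): String =
      try poll() catch {
        case NonFatal(e) =>
          Console.err.println(s"Failed to contact YARN: $e")
          "FAILED"   // degrade to a terminal state instead of retrying forever
      }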
spark git commit: [SPARK-8290] spark class command builder need read SPARK_JAVA_OPTS and SPARK_DRIVER_MEMORY properly
Repository: spark Updated Branches: refs/heads/master 80043e9e7 -> cb871c44c

[SPARK-8290] spark class command builder need read SPARK_JAVA_OPTS and SPARK_DRIVER_MEMORY properly

SPARK_JAVA_OPTS was missed in reconstructing the launcher part; we should add it back so processes launched by spark-class can read it properly. And so does SPARK_DRIVER_MEMORY. The missing part is [here](https://github.com/apache/spark/blob/1c30afdf94b27e1ad65df0735575306e65d148a1/bin/spark-class#L97).

Author: WangTaoTheTonic wangtao...@huawei.com
Author: Tao Wang wangtao...@huawei.com

Closes #6741 from WangTaoTheTonic/SPARK-8290 and squashes the following commits:

bd89f0f [Tao Wang] make sure the memory setting is right too
e313520 [WangTaoTheTonic] spark class command builder need read SPARK_JAVA_OPTS

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cb871c44
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cb871c44
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cb871c44

Branch: refs/heads/master
Commit: cb871c44c38a4c1575ed076389f14641afafad7d
Parents: 80043e9
Author: WangTaoTheTonic wangtao...@huawei.com
Authored: Wed Jun 10 13:30:16 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:30:16 2015 -0700

--
 .../java/org/apache/spark/launcher/SparkClassCommandBuilder.java | 3 +++
 1 file changed, 3 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/cb871c44/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java
--
diff --git a/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java b/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java
index d80abf2..de85720 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/SparkClassCommandBuilder.java
@@ -93,6 +93,9 @@ class SparkClassCommandBuilder extends AbstractCommandBuilder {
         toolsDir.getAbsolutePath(), className);

       javaOptsKeys.add("SPARK_JAVA_OPTS");
+    } else {
+      javaOptsKeys.add("SPARK_JAVA_OPTS");
+      memKey = "SPARK_DRIVER_MEMORY";
     }

     List<String> cmd = buildJavaCommand(extraClassPath);
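[Editor's note] A hypothetical sketch of the behavior being restored, in Scala for consistency with the rest of this digest (the environment variable names are the real ones; the surrounding code is simplified and is not the launcher's actual Java):

    // Read extra JVM options and driver memory from the environment, as the
    // old bin/spark-class shell script did before the launcher rewrite.
    val javaOpts: Seq[String] =
      sys.env.get("SPARK_JAVA_OPTS").toSeq.flatMap(_.split("\\s+"))
    val driverMemory: String =
      sys.env.getOrElse("SPARK_DRIVER_MEMORY", "1g") // fallback if unset
    println(s"extra opts: $javaOpts, heap: -Xmx$driverMemory")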
spark git commit: [SPARK-8273] Driver hangs up when yarn shutdown in client mode
Repository: spark Updated Branches: refs/heads/branch-1.4 568d1d51d -> 2846a357f

[SPARK-8273] Driver hangs up when yarn shutdown in client mode

In client mode, if YARN was shut down with a Spark application running, the application will hang after several retries (default: 30) because the exception thrown by YarnClientImpl could not be caught at the upper level; we should exit instead, so that the user is made aware of the failure. The exception we want to catch is [here](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java#L122), and the fix follows the approach in [MR](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java#L320).

Author: WangTaoTheTonic wangtao...@huawei.com

Closes #6717 from WangTaoTheTonic/SPARK-8273 and squashes the following commits:

28752d6 [WangTaoTheTonic] catch the thrown exception

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2846a357
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2846a357
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2846a357

Branch: refs/heads/branch-1.4
Commit: 2846a357f32bfa129bc37f4d1cbe9e19caaf69c9
Parents: 568d1d5
Author: WangTaoTheTonic wangtao...@huawei.com
Authored: Wed Jun 10 13:34:19 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:36:16 2015 -0700

--
 yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala | 4 
 1 file changed, 4 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/2846a357/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
--
diff --git a/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index f4d4321..9296e79 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -28,6 +28,7 @@ import scala.collection.JavaConversions._
 import scala.collection.mutable.{ArrayBuffer, HashMap, HashSet, ListBuffer, Map}
 import scala.reflect.runtime.universe
 import scala.util.{Try, Success, Failure}
+import scala.util.control.NonFatal

 import com.google.common.base.Objects
 import com.google.common.io.Files
@@ -771,6 +772,9 @@ private[spark] class Client(
       case e: ApplicationNotFoundException =>
         logError(s"Application $appId not found.")
         return (YarnApplicationState.KILLED, FinalApplicationStatus.KILLED)
+      case NonFatal(e) =>
+        logError(s"Failed to contact YARN for application $appId.", e)
+        return (YarnApplicationState.FAILED, FinalApplicationStatus.FAILED)
     }
     val state = report.getYarnApplicationState
spark git commit: [SPARK-5479] [YARN] Handle --py-files correctly in YARN.
Repository: spark Updated Branches: refs/heads/master 8f7308f9c -> 38112905b

[SPARK-5479] [YARN] Handle --py-files correctly in YARN.

The bug description is a little misleading: the actual issue is that .py files are not handled correctly when distributed by YARN. They're added to spark.submit.pyFiles, which, when processed by context.py, explicitly whitelists certain extensions (see PACKAGE_EXTENSIONS), and that does not include .py files. On top of that, archives were not handled at all! They made it to the driver's python path, but never made it to executors, since the mechanism used to propagate their location (spark.submit.pyFiles) only works on the driver side.

So, instead, ignore spark.submit.pyFiles and just build PYTHONPATH correctly for both driver and executors. Individual .py files are placed in a subdirectory of the container's local dir in the cluster, which is then added to the python path. Archives are added directly. The change, as a side effect, ends up solving the symptom described in the bug. The issue was not that the files were not being distributed, but that they were never made visible to the python application running under Spark.

Also included is a proper unit test for running python on YARN, which broke in several different ways with the previous code.

A short walkthrough of the changes (a hedged sketch of the resulting PYTHONPATH appears after this commit's diff summary below):

- SparkSubmit does not try to be smart about how YARN handles python files anymore. It just passes down the configs to the YARN client code.
- The YARN client distributes python files and archives differently, placing the files in a subdirectory.
- The YARN client now sets PYTHONPATH for the processes it launches; to properly handle different locations, it uses YARN's support for embedding env variables, so to avoid YARN expanding those at the wrong time, SparkConf is now propagated to the AM using a conf file instead of command line options.
- Because the Client initialization code is a maze of implicit dependencies, some code needed to be moved around to make sure all needed state was available when the code ran.
- The pyspark tests in YarnClusterSuite now actually distribute and try to use both a python file and an archive containing a different python module. Also added a yarn-client test for completeness.
- I cleaned up some of the code around distributing files to YARN, to avoid adding more copy-pasted code to handle the new files being distributed.

Author: Marcelo Vanzin van...@cloudera.com

Closes #6360 from vanzin/SPARK-5479 and squashes the following commits:

bcaf7e6 [Marcelo Vanzin] Feedback.
c47501f [Marcelo Vanzin] Fix yarn-client mode.
46b1d0c [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
c743778 [Marcelo Vanzin] Only pyspark cares about python archives.
c8e5a82 [Marcelo Vanzin] Actually run pyspark in client mode.
705571d [Marcelo Vanzin] Move some code to the YARN module.
1dd4d0c [Marcelo Vanzin] Review feedback.
71ee736 [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
220358b [Marcelo Vanzin] Scalastyle.
cdbb990 [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
7fe3cd4 [Marcelo Vanzin] No need to distribute primary file to executors.
09045f1 [Marcelo Vanzin] Style.
943cbf4 [Marcelo Vanzin] [SPARK-5479] [yarn] Handle --py-files correctly in YARN.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/38112905
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/38112905
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/38112905

Branch: refs/heads/master
Commit: 38112905bc3b33f2ae75274afba1c30e116f6e46
Parents: 8f7308f
Author: Marcelo Vanzin van...@cloudera.com
Authored: Wed Jun 10 13:17:29 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:17:29 2015 -0700

--
 .../org/apache/spark/deploy/SparkSubmit.scala   |  77 ++---
 .../spark/deploy/yarn/ApplicationMaster.scala   |  20 +-
 .../yarn/ApplicationMasterArguments.scala       |  12 +-
 .../org/apache/spark/deploy/yarn/Client.scala   | 295 ---
 .../spark/deploy/yarn/ClientArguments.scala     |   4 +-
 .../cluster/YarnClientSchedulerBackend.scala    |   5 +-
 .../apache/spark/deploy/yarn/ClientSuite.scala  |   4 +-
 .../spark/deploy/yarn/YarnClusterSuite.scala    |  61 ++--
 8 files changed, 270 insertions(+), 208 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/38112905/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index a0eae77..b8978e2 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
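[Editor's note] To illustrate the PYTHONPATH construction the description walks through, a hedged sketch (the directory and archive names are examples only, not the commit's actual constants; {{PWD}} is YARN's own variable-embedding syntax, expanded inside the container rather than on the driver):

    // Individual .py files go under a subdirectory of the container's local
    // dir; archives are added to the path directly.
    val pyFilesSubdir = "__pyfiles__"                 // illustrative name
    val pythonPath = Seq(
      s"{{PWD}}/$pyFilesSubdir",                      // distributed .py files
      "{{PWD}}/pyspark.zip",                          // archives, added as-is
      "{{PWD}}/py4j-0.8.2.1-src.zip"
    ).mkString(java.io.File.pathSeparator)

Deferring expansion of {{PWD}} to the container is the reason SparkConf moved from command-line options to a conf file: it keeps YARN from expanding the variables at the wrong time.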
spark git commit: [SPARK-7527] [CORE] Fix createNullValue to return the correct null values and REPL mode detection
Repository: spark Updated Branches: refs/heads/master 19e30b48f -> e90c9d92d

[SPARK-7527] [CORE] Fix createNullValue to return the correct null values and REPL mode detection

The root cause of SPARK-7527 is that `createNullValue` returns an incompatible value `Byte(0)` for `char` and `boolean`. This PR fixes it and corrects the class name of the main class, and also adds a unit test to demonstrate it.

Author: zsxwing zsxw...@gmail.com

Closes #6735 from zsxwing/SPARK-7527 and squashes the following commits:

bbdb271 [zsxwing] Use pattern match in createNullValue
b0a0e7e [zsxwing] Remove the noisy in the test output
903e269 [zsxwing] Remove the code for Utils.isInInterpreter == false
5f92dc1 [zsxwing] Fix createNullValue to return the correct null values and REPL mode detection

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e90c9d92
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e90c9d92
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e90c9d92

Branch: refs/heads/master
Commit: e90c9d92d9a86e9960c10a5c043f3c02f6c636f9
Parents: 19e30b4
Author: zsxwing zsxw...@gmail.com
Authored: Wed Jun 10 13:22:52 2015 -0700
Committer: Andrew Or and...@databricks.com
Committed: Wed Jun 10 13:24:02 2015 -0700

--
 .../org/apache/spark/util/ClosureCleaner.scala   | 40 --
 .../scala/org/apache/spark/util/Utils.scala      |  9 +---
 .../apache/spark/util/ClosureCleanerSuite.scala  | 44 
 3 files changed, 64 insertions(+), 29 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/e90c9d92/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala b/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala
index 6f2966b..305de4c 100644
--- a/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala
+++ b/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala
@@ -109,7 +109,14 @@ private[spark] object ClosureCleaner extends Logging {

   private def createNullValue(cls: Class[_]): AnyRef = {
     if (cls.isPrimitive) {
-      new java.lang.Byte(0: Byte) // Should be convertible to any primitive type
+      cls match {
+        case java.lang.Boolean.TYPE => new java.lang.Boolean(false)
+        case java.lang.Character.TYPE => new java.lang.Character('\0')
+        case java.lang.Void.TYPE =>
+          // This should not happen because `Foo(void x) {}` does not compile.
+          throw new IllegalStateException("Unexpected void parameter in constructor")
+        case _ => new java.lang.Byte(0: Byte)
+      }
     } else {
       null
     }
@@ -319,28 +326,17 @@ private[spark] object ClosureCleaner extends Logging {
   private def instantiateClass(
       cls: Class[_],
       enclosingObject: AnyRef): AnyRef = {
-    if (!Utils.isInInterpreter) {
-      // This is a bona fide closure class, whose constructor has no effects
-      // other than to set its fields, so use its constructor
-      val cons = cls.getConstructors()(0)
-      val params = cons.getParameterTypes.map(createNullValue).toArray
-      if (enclosingObject != null) {
-        params(0) = enclosingObject // First param is always enclosing object
-      }
-      return cons.newInstance(params: _*).asInstanceOf[AnyRef]
-    } else {
-      // Use reflection to instantiate object without calling constructor
-      val rf = sun.reflect.ReflectionFactory.getReflectionFactory()
-      val parentCtor = classOf[java.lang.Object].getDeclaredConstructor()
-      val newCtor = rf.newConstructorForSerialization(cls, parentCtor)
-      val obj = newCtor.newInstance().asInstanceOf[AnyRef]
-      if (enclosingObject != null) {
-        val field = cls.getDeclaredField("$outer")
-        field.setAccessible(true)
-        field.set(obj, enclosingObject)
-      }
-      obj
+    // Use reflection to instantiate object without calling constructor
+    val rf = sun.reflect.ReflectionFactory.getReflectionFactory()
+    val parentCtor = classOf[java.lang.Object].getDeclaredConstructor()
+    val newCtor = rf.newConstructorForSerialization(cls, parentCtor)
+    val obj = newCtor.newInstance().asInstanceOf[AnyRef]
+    if (enclosingObject != null) {
+      val field = cls.getDeclaredField("$outer")
+      field.setAccessible(true)
+      field.set(obj, enclosingObject)
     }
+    obj
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/e90c9d92/core/src/main/scala/org/apache/spark/util/Utils.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 153ece6..19157af 100644
---
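[Editor's note] A small demonstration of the bug being fixed (illustrative code, not from the commit): Java reflection widens primitive arguments (byte to int, etc.), but there is no conversion from byte to boolean or char, so passing a boxed Byte where a primitive boolean is expected fails.

    class Holder(val flag: Boolean)          // constructor takes a primitive boolean
    val ctor = classOf[Holder].getConstructors()(0)

    // ctor.newInstance(new java.lang.Byte(0: Byte)) // IllegalArgumentException:
    //                                               // argument type mismatch
    ctor.newInstance(java.lang.Boolean.FALSE)        // ok: correctly typed null value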
spark git commit: [SQL] [MINOR] Fixes a minor Java example error in SQL programming guide
Repository: spark Updated Branches: refs/heads/master 2b550a521 -> 8f7308f9c

[SQL] [MINOR] Fixes a minor Java example error in SQL programming guide

Author: Cheng Lian l...@databricks.com

Closes #6749 from liancheng/java-sample-fix and squashes the following commits:

5b44585 [Cheng Lian] Fixes a minor Java example error in SQL programming guide

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8f7308f9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8f7308f9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8f7308f9

Branch: refs/heads/master
Commit: 8f7308f9c49805b9486aaae5f60e4481e8ba24e8
Parents: 2b550a5
Author: Cheng Lian l...@databricks.com
Authored: Wed Jun 10 11:48:14 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 11:48:14 2015 -0700

--
 docs/sql-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/8f7308f9/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 40e33f7..c5ab074 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1479,7 +1479,7 @@ expressed in HiveQL.
 {% highlight java %}
 // sc is an existing JavaSparkContext.
-HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc);
+HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc.sc);

 sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
 sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src");
spark git commit: [SQL] [MINOR] Fixes a minor Java example error in SQL programming guide
Repository: spark Updated Branches: refs/heads/branch-1.4 a0a7f2f92 -> 7b88e6a1e

[SQL] [MINOR] Fixes a minor Java example error in SQL programming guide

Author: Cheng Lian l...@databricks.com

Closes #6749 from liancheng/java-sample-fix and squashes the following commits:

5b44585 [Cheng Lian] Fixes a minor Java example error in SQL programming guide

(cherry picked from commit 8f7308f9c49805b9486aaae5f60e4481e8ba24e8)
Signed-off-by: Reynold Xin r...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7b88e6a1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7b88e6a1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7b88e6a1

Branch: refs/heads/branch-1.4
Commit: 7b88e6a1e3b8f79cb41842bc21893760dc4b74e6
Parents: a0a7f2f
Author: Cheng Lian l...@databricks.com
Authored: Wed Jun 10 11:48:14 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 11:48:19 2015 -0700

--
 docs/sql-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/7b88e6a1/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index cde5830..17f2954 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1475,7 +1475,7 @@ expressed in HiveQL.
 {% highlight java %}
 // sc is an existing JavaSparkContext.
-HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc);
+HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc.sc);

 sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
 sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src");
spark git commit: [SPARK-7996] Deprecate the developer api SparkEnv.actorSystem
Repository: spark Updated Branches: refs/heads/master c6ba7cca3 -> 2b550a521

[SPARK-7996] Deprecate the developer api SparkEnv.actorSystem

Changed ```SparkEnv.actorSystem``` to be a function such that we can use the deprecated flag with it and added a deprecated message.

Author: Ilya Ganelin ilya.gane...@capitalone.com

Closes #6731 from ilganeli/SPARK-7996 and squashes the following commits:

be43817 [Ilya Ganelin] Restored to val
9ed89e7 [Ilya Ganelin] Added a version info for deprecation
9610b08 [Ilya Ganelin] Converted actorSystem to function and added deprecated flag

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2b550a52
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2b550a52
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2b550a52

Branch: refs/heads/master
Commit: 2b550a521e45e1dbca2cca40ddd94e20c013831c
Parents: c6ba7cc
Author: Ilya Ganelin ilya.gane...@capitalone.com
Authored: Wed Jun 10 11:21:12 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 11:21:12 2015 -0700

--
 core/src/main/scala/org/apache/spark/SparkEnv.scala | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/2b550a52/core/src/main/scala/org/apache/spark/SparkEnv.scala
--
diff --git a/core/src/main/scala/org/apache/spark/SparkEnv.scala b/core/src/main/scala/org/apache/spark/SparkEnv.scala
index a185954..b066557 100644
--- a/core/src/main/scala/org/apache/spark/SparkEnv.scala
+++ b/core/src/main/scala/org/apache/spark/SparkEnv.scala
@@ -20,6 +20,8 @@ package org.apache.spark
 import java.io.File
 import java.net.Socket

+import akka.actor.ActorSystem
+
 import scala.collection.JavaConversions._
 import scala.collection.mutable
 import scala.util.Properties
@@ -75,7 +77,8 @@ class SparkEnv (
     val conf: SparkConf) extends Logging {

   // TODO Remove actorSystem
-  val actorSystem = rpcEnv.asInstanceOf[AkkaRpcEnv].actorSystem
+  @deprecated("Actor system is no longer supported as of 1.4")
+  val actorSystem: ActorSystem = rpcEnv.asInstanceOf[AkkaRpcEnv].actorSystem

   private[spark] var isStopped = false
   private val pythonWorkers = mutable.HashMap[(String, Map[String, String]), PythonWorkerFactory]()
spark git commit: [SPARK-8200] [MLLIB] Check for empty RDDs in StreamingLinearAlgorithm
Repository: spark Updated Branches: refs/heads/branch-1.4 2846a357f -> 59fc3f197

[SPARK-8200] [MLLIB] Check for empty RDDs in StreamingLinearAlgorithm

Test cases for both StreamingLinearRegression and StreamingLogisticRegression, and code fix.

Edit: This contribution is my original work and I license the work to the project under the project's open source license.

Author: Paavo ppark...@gmail.com

Closes #6713 from pparkkin/streamingmodel-empty-rdd and squashes the following commits:

ff5cd78 [Paavo] Update strings to use interpolation.
db234cf [Paavo] Use !rdd.isEmpty.
54ad89e [Paavo] Test case for empty stream.
393e36f [Paavo] Ignore empty RDDs.
0bfc365 [Paavo] Test case for empty stream.

(cherry picked from commit b928f543845ddd39e914a0e8f0b0205fd86100c5)
Signed-off-by: Sean Owen so...@cloudera.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/59fc3f19
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/59fc3f19
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/59fc3f19

Branch: refs/heads/branch-1.4
Commit: 59fc3f197247c6c8c40ea7479573af023c89d718
Parents: 2846a35
Author: Paavo ppark...@gmail.com
Authored: Wed Jun 10 23:17:42 2015 +0100
Committer: Sean Owen so...@cloudera.com
Committed: Wed Jun 10 23:26:54 2015 +0100

--
 .../regression/StreamingLinearAlgorithm.scala   | 28 +++-
 .../StreamingLogisticRegressionSuite.scala      | 17 
 .../StreamingLinearRegressionSuite.scala        | 18 +
 3 files changed, 50 insertions(+), 13 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/59fc3f19/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala b/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
index cea8f3f..2dd8aca 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
@@ -83,21 +83,23 @@ abstract class StreamingLinearAlgorithm[
       throw new IllegalArgumentException("Model must be initialized before starting training.")
     }
     data.foreachRDD { (rdd, time) =>
-      val initialWeights =
-        model match {
-          case Some(m) =>
-            m.weights
-          case None =>
-            val numFeatures = rdd.first().features.size
-            Vectors.dense(numFeatures)
+      if (!rdd.isEmpty) {
+        val initialWeights =
+          model match {
+            case Some(m) =>
+              m.weights
+            case None =>
+              val numFeatures = rdd.first().features.size
+              Vectors.dense(numFeatures)
+          }
+        model = Some(algorithm.run(rdd, initialWeights))
+        logInfo(s"Model updated at time ${time.toString}")
+        val display = model.get.weights.size match {
+          case x if x > 100 => model.get.weights.toArray.take(100).mkString("[", ",", "...")
+          case _ => model.get.weights.toArray.mkString("[", ",", "]")
         }
-      model = Some(algorithm.run(rdd, initialWeights))
-      logInfo("Model updated at time %s".format(time.toString))
-      val display = model.get.weights.size match {
-        case x if x > 100 => model.get.weights.toArray.take(100).mkString("[", ",", "...")
-        case _ => model.get.weights.toArray.mkString("[", ",", "]")
+        logInfo(s"Current model: weights, ${display}")
       }
-      logInfo("Current model: weights, %s".format(display))
     }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/59fc3f19/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
--
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
index e98b61e..fd65329 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
@@ -158,4 +158,21 @@ class StreamingLogisticRegressionSuite extends SparkFunSuite with TestSuiteBase
     val error = output.map(batch => batch.map(p => math.abs(p._1 - p._2)).sum / nPoints).toList
     assert(error.head > 0.8 && error.last < 0.2)
   }
+
+  // Test empty RDDs in a stream
+  test("handling empty RDDs in a stream") {
+    val model = new StreamingLogisticRegressionWithSGD()
+      .setInitialWeights(Vectors.dense(-0.1))
+      .setStepSize(0.01)
+
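[Editor's note] The core of the fix is a guard in the foreachRDD callback. A minimal, self-contained sketch of the same pattern (the trainOn/update names are stand-ins, not the real class structure): without the guard, rdd.first() throws on an empty micro-batch.

    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.dstream.DStream

    def trainOn[T](data: DStream[T])(update: RDD[T] => Unit): Unit =
      data.foreachRDD { (rdd, time) =>
        // Only touch first()/run() when the micro-batch actually has data.
        if (!rdd.isEmpty) update(rdd)
      }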
spark git commit: [SPARK-2774] Set preferred locations for reduce tasks
Repository: spark Updated Branches: refs/heads/master 5014d0ed7 -> 96a7c888d

[SPARK-2774] Set preferred locations for reduce tasks

Set preferred locations for reduce tasks. The basic design is that we maintain a map from reducerId to a list of (sizes, locations) for each shuffle. We then set the preferred locations to be any machines that have 20% or more of the output that needs to be read by the reduce task. This will result in at most 5 preferred locations for each reduce task.

Selecting the preferred locations involves O(# map tasks * # reduce tasks) computation, so we restrict this feature to cases where we have fewer than 1000 map tasks and 1000 reduce tasks.

Author: Shivaram Venkataraman shiva...@cs.berkeley.edu

Closes #6652 from shivaram/reduce-locations and squashes the following commits:

492e25e [Shivaram Venkataraman] Remove unused import
2ef2d39 [Shivaram Venkataraman] Address code review comments
897a914 [Shivaram Venkataraman] Remove unused hash map
f5be578 [Shivaram Venkataraman] Use fraction of map outputs to determine locations Also removes caching of preferred locations to make the API cleaner
68bc29e [Shivaram Venkataraman] Fix line length
1090b58 [Shivaram Venkataraman] Change flag name
77ce7d8 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
e5d56bd [Shivaram Venkataraman] Add flag to turn off locality for shuffle deps
6cfae98 [Shivaram Venkataraman] Filter out zero blocks, rename variables
9d5831a [Shivaram Venkataraman] Address some more comments
8e31266 [Shivaram Venkataraman] Fix style
0df3180 [Shivaram Venkataraman] Address code review comments
e7d5449 [Shivaram Venkataraman] Fix merge issues
ad7cb53 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
df14cee [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
5093aea [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
0171d3c [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
bc4dfd6 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
774751b [Shivaram Venkataraman] Fix bug introduced by line length adjustment
34d0283 [Shivaram Venkataraman] Fix style issues
3b464b7 [Shivaram Venkataraman] Set preferred locations for reduce tasks This is another attempt at #1697 addressing some of the earlier concerns. This adds a couple of thresholds based on the number of map and reduce tasks beyond which we don't use preferred locations for reduce tasks.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/96a7c888 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/96a7c888 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/96a7c888 Branch: refs/heads/master Commit: 96a7c888d806adfdb2c722025a1079ed7eaa2052 Parents: 5014d0e Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Authored: Wed Jun 10 15:03:40 2015 -0700 Committer: Kay Ousterhout kayousterh...@gmail.com Committed: Wed Jun 10 15:04:38 2015 -0700 -- .../org/apache/spark/MapOutputTracker.scala | 49 - .../apache/spark/scheduler/DAGScheduler.scala | 37 +- .../apache/spark/MapOutputTrackerSuite.scala| 35 + .../spark/scheduler/DAGSchedulerSuite.scala | 76 +++- 4 files changed, 177 insertions(+), 20 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/96a7c888/core/src/main/scala/org/apache/spark/MapOutputTracker.scala -- diff --git a/core/src/main/scala/org/apache/spark/MapOutputTracker.scala b/core/src/main/scala/org/apache/spark/MapOutputTracker.scala index 0184228..862ffe8 100644 --- a/core/src/main/scala/org/apache/spark/MapOutputTracker.scala +++ b/core/src/main/scala/org/apache/spark/MapOutputTracker.scala @@ -21,7 +21,7 @@ import java.io._ import java.util.concurrent.ConcurrentHashMap import java.util.zip.{GZIPInputStream, GZIPOutputStream} -import scala.collection.mutable.{HashSet, Map} +import scala.collection.mutable.{HashMap, HashSet, Map} import scala.collection.JavaConversions._ import scala.reflect.ClassTag @@ -284,6 +284,53 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf) cachedSerializedStatuses.contains(shuffleId) || mapStatuses.contains(shuffleId) } + /** + * Return a list of locations that each have fraction of map output greater than the specified + * threshold. + * + * @param shuffleId id of the shuffle + * @param reducerId id of the reduce task + * @param numReducers total number of reducers in the shuffle + * @param fractionThreshold
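The selection logic the new method implements is easy to state on its own: for a single reduce task, keep every location that holds at least the threshold fraction of that task's total input. A plain-Scala sketch of that filter (illustrative only, not the MapOutputTracker code, which works over MapStatus arrays):

    // Bytes of map output per host, for a single reduce task.
    def preferredLocations(
        sizesByHost: Map[String, Long],
        fractionThreshold: Double): Seq[String] = {
      val total = sizesByHost.values.sum.toDouble
      if (total <= 0) Seq.empty
      else sizesByHost.collect {
        case (host, bytes) if bytes / total >= fractionThreshold => host
      }.toSeq
    }

    // With the 0.2 threshold from the commit message at most five hosts can qualify:
    // preferredLocations(Map("a" -> 80L, "b" -> 15L, "c" -> 5L), 0.2) == Seq("a")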
spark git commit: [SPARK-8200] [MLLIB] Check for empty RDDs in StreamingLinearAlgorithm
Repository: spark Updated Branches: refs/heads/master 96a7c888d -> b928f5438

[SPARK-8200] [MLLIB] Check for empty RDDs in StreamingLinearAlgorithm

Test cases for both StreamingLinearRegression and StreamingLogisticRegression, and code fix.

Edit: This contribution is my original work and I license the work to the project under the project's open source license.

Author: Paavo ppark...@gmail.com

Closes #6713 from pparkkin/streamingmodel-empty-rdd and squashes the following commits:

ff5cd78 [Paavo] Update strings to use interpolation.
db234cf [Paavo] Use !rdd.isEmpty.
54ad89e [Paavo] Test case for empty stream.
393e36f [Paavo] Ignore empty RDDs.
0bfc365 [Paavo] Test case for empty stream.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b928f543
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b928f543
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b928f543

Branch: refs/heads/master
Commit: b928f543845ddd39e914a0e8f0b0205fd86100c5
Parents: 96a7c88
Author: Paavo ppark...@gmail.com
Authored: Wed Jun 10 23:17:42 2015 +0100
Committer: Sean Owen so...@cloudera.com
Committed: Wed Jun 10 23:17:42 2015 +0100

--
 .../regression/StreamingLinearAlgorithm.scala   | 14 --
 .../StreamingLogisticRegressionSuite.scala      | 17 +
 .../StreamingLinearRegressionSuite.scala        | 18 ++
 3 files changed, 43 insertions(+), 6 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/b928f543/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala b/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
index aee51bf..141052b 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/regression/StreamingLinearAlgorithm.scala
@@ -83,13 +83,15 @@ abstract class StreamingLinearAlgorithm[
       throw new IllegalArgumentException("Model must be initialized before starting training.")
     }
     data.foreachRDD { (rdd, time) =>
-      model = Some(algorithm.run(rdd, model.get.weights))
-      logInfo("Model updated at time %s".format(time.toString))
-      val display = model.get.weights.size match {
-        case x if x > 100 => model.get.weights.toArray.take(100).mkString("[", ",", "...")
-        case _ => model.get.weights.toArray.mkString("[", ",", "]")
+      if (!rdd.isEmpty) {
+        model = Some(algorithm.run(rdd, model.get.weights))
+        logInfo(s"Model updated at time ${time.toString}")
+        val display = model.get.weights.size match {
+          case x if x > 100 => model.get.weights.toArray.take(100).mkString("[", ",", "...")
+          case _ => model.get.weights.toArray.mkString("[", ",", "]")
+        }
+        logInfo(s"Current model: weights, ${display}")
       }
-      logInfo("Current model: weights, %s".format(display))
     }
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/b928f543/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
--
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
index e98b61e..fd65329 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/classification/StreamingLogisticRegressionSuite.scala
@@ -158,4 +158,21 @@ class StreamingLogisticRegressionSuite extends SparkFunSuite with TestSuiteBase
     val error = output.map(batch => batch.map(p => math.abs(p._1 - p._2)).sum / nPoints).toList
     assert(error.head > 0.8 && error.last < 0.2)
   }
+
+  // Test empty RDDs in a stream
+  test("handling empty RDDs in a stream") {
+    val model = new StreamingLogisticRegressionWithSGD()
+      .setInitialWeights(Vectors.dense(-0.1))
+      .setStepSize(0.01)
+      .setNumIterations(10)
+    val numBatches = 10
+    val emptyInput = Seq.empty[Seq[LabeledPoint]]
+    val ssc = setupStreams(emptyInput,
+      (inputDStream: DStream[LabeledPoint]) => {
+        model.trainOn(inputDStream)
+        model.predictOnValues(inputDStream.map(x => (x.label, x.features)))
+      }
+    )
+    val output: Seq[Seq[(Double, Double)]] = runStreams(ssc, numBatches, numBatches)
+  }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b928f543/mllib/src/test/scala/org/apache/spark/mllib/regression/StreamingLinearRegressionSuite.scala
spark git commit: [SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first
Repository: spark Updated Branches: refs/heads/branch-1.4 59fc3f197 -> 5c05b5c0d

[SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first

    case cs @ CombineSum(expr) =>
      val calcType = expr.dataType
        expr.dataType match {
          case DecimalType.Fixed(_, _) =>
            DecimalType.Unlimited
          case _ =>
            expr.dataType
        }

calcType is always expr.dataType. Credits all belong to IntelliJ.

Author: navis.ryu na...@apache.org

Closes #6736 from navis/SPARK-8285 and squashes the following commits:

20382c1 [navis.ryu] [SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first

(cherry picked from commit 6a47114bc297f0bce874e425feb1c24a5c26cef0)
Signed-off-by: Reynold Xin r...@databricks.com

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5c05b5c0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5c05b5c0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5c05b5c0

Branch: refs/heads/branch-1.4
Commit: 5c05b5c0d25fc902bf95ed7b93ad7b5775631150
Parents: 59fc3f1
Author: navis.ryu na...@apache.org
Authored: Wed Jun 10 18:19:12 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 18:19:24 2015 -0700

--
 .../org/apache/spark/sql/execution/GeneratedAggregate.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/5c05b5c0/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
index 3e27c1b..af37917 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
@@ -118,7 +118,7 @@ case class GeneratedAggregate(
         AggregateEvaluation(currentSum :: Nil, initialValue :: Nil, updateFunction :: Nil, result)

       case cs @ CombineSum(expr) =>
-        val calcType = expr.dataType
+        val calcType =
           expr.dataType match {
             case DecimalType.Fixed(_, _) =>
               DecimalType.Unlimited
@@ -129,7 +129,7 @@ case class GeneratedAggregate(
         val currentSum = AttributeReference("currentSum", calcType, nullable = true)()
         val initialValue = Literal.create(null, calcType)

-        // Coalasce avoids double calculation...
+        // Coalesce avoids double calculation...
         // but really, common sub expression elimination would be better
         val zero = Cast(Literal(0), calcType)
         // If we're evaluating UnscaledValue(x), we can do Count on x directly, since its

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first
Repository: spark Updated Branches: refs/heads/master 37719e0cd -> 6a47114bc

[SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first

    case cs @ CombineSum(expr) =>
      val calcType = expr.dataType
        expr.dataType match {
          case DecimalType.Fixed(_, _) =>
            DecimalType.Unlimited
          case _ =>
            expr.dataType
        }

calcType is always expr.dataType. Credits all belong to IntelliJ.

Author: navis.ryu na...@apache.org

Closes #6736 from navis/SPARK-8285 and squashes the following commits:

20382c1 [navis.ryu] [SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6a47114b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6a47114b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6a47114b

Branch: refs/heads/master
Commit: 6a47114bc297f0bce874e425feb1c24a5c26cef0
Parents: 37719e0
Author: navis.ryu na...@apache.org
Authored: Wed Jun 10 18:19:12 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 18:19:12 2015 -0700

--
 .../org/apache/spark/sql/execution/GeneratedAggregate.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/6a47114b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
index 3e27c1b..af37917 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
@@ -118,7 +118,7 @@ case class GeneratedAggregate(
         AggregateEvaluation(currentSum :: Nil, initialValue :: Nil, updateFunction :: Nil, result)

       case cs @ CombineSum(expr) =>
-        val calcType = expr.dataType
+        val calcType =
           expr.dataType match {
             case DecimalType.Fixed(_, _) =>
               DecimalType.Unlimited
@@ -129,7 +129,7 @@ case class GeneratedAggregate(
         val currentSum = AttributeReference("currentSum", calcType, nullable = true)()
         val initialValue = Literal.create(null, calcType)

-        // Coalasce avoids double calculation...
+        // Coalesce avoids double calculation...
         // but really, common sub expression elimination would be better
         val zero = Cast(Literal(0), calcType)
         // If we're evaluating UnscaledValue(x), we can do Count on x directly, since its

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
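Why the dangling `expr.dataType` was a real bug rather than harmless dead code: with `calcType` pinned to the input's fixed precision, each partial sum is rounded back into that type, so sums that outgrow it silently lose digits. A plain `BigDecimal` sketch of the difference (illustrative, not the GeneratedAggregate code path):

    import java.math.MathContext

    // DecimalType.Fixed(5, 2) can hold at most 999.99.
    val inputs = Seq(BigDecimal("999.99"), BigDecimal("999.99"), BigDecimal("999.99"))

    // Accumulating with unlimited precision keeps the exact total, 2999.97...
    val exact = inputs.foldLeft(BigDecimal(0))(_ + _)

    // ...while forcing each partial sum back to 5 significant digits does not:
    // 1999.98 becomes 2000.0 mid-sum, and the final result drifts to 3000.0.
    val lossy = inputs.foldLeft(BigDecimal(0)) { (acc, x) =>
      (acc + x).round(new MathContext(5))
    }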
svn commit: r9331 - /dev/spark/spark-1.4.0-rc4/ /release/spark/spark-1.4.0/
Author: pwendell Date: Wed Jun 10 23:18:01 2015 New Revision: 9331 Log: Adding Spark release 1.4.0 Added: release/spark/spark-1.4.0/ - copied from r9330, dev/spark/spark-1.4.0-rc4/ Removed: dev/spark/spark-1.4.0-rc4/ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-8164] transformExpressions should support nested expression sequence
Repository: spark Updated Branches: refs/heads/master 6a47114bc -> 4e42842e8

[SPARK-8164] transformExpressions should support nested expression sequence

Currently we only support `Seq[Expression]`; we should handle cases like `Seq[Seq[Expression]]` so that we can remove the unnecessary `GroupExpression`.

Author: Wenchen Fan cloud0...@outlook.com

Closes #6706 from cloud-fan/clean and squashes the following commits:

60a1193 [Wenchen Fan] support nested expression sequence and remove GroupExpression

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4e42842e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4e42842e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4e42842e

Branch: refs/heads/master
Commit: 4e42842e82e058d54329bd66185d8a7e77ab335a
Parents: 6a47114
Author: Wenchen Fan cloud0...@outlook.com
Authored: Wed Jun 10 18:22:47 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 18:22:47 2015 -0700

--
 .../spark/sql/catalyst/analysis/Analyzer.scala  | 6 +++---
 .../sql/catalyst/expressions/Expression.scala   | 12 ---
 .../spark/sql/catalyst/plans/QueryPlan.scala    | 22 +---
 .../catalyst/plans/logical/basicOperators.scala | 2 +-
 .../sql/catalyst/trees/TreeNodeSuite.scala      | 14 +
 .../org/apache/spark/sql/execution/Expand.scala | 4 ++--
 6 files changed, 30 insertions(+), 30 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/4e42842e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index c4f12cf..cbd8def 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -172,8 +172,8 @@ class Analyzer(
      * expressions which equal GroupBy expressions with Literal(null), if those expressions
      * are not set for this grouping set (according to the bit mask).
      */
-    private[this] def expand(g: GroupingSets): Seq[GroupExpression] = {
-      val result = new scala.collection.mutable.ArrayBuffer[GroupExpression]
+    private[this] def expand(g: GroupingSets): Seq[Seq[Expression]] = {
+      val result = new scala.collection.mutable.ArrayBuffer[Seq[Expression]]

       g.bitmasks.foreach { bitmask =>
         // get the non selected grouping attributes according to the bit mask
@@ -194,7 +194,7 @@ class Analyzer(
             Literal.create(bitmask, IntegerType)
         })

-        result += GroupExpression(substitution)
+        result += substitution
       }

       result.toSeq

http://git-wip-us.apache.org/repos/asf/spark/blob/4e42842e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
index a05794f..63dd5f9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
@@ -239,18 +239,6 @@ abstract class UnaryExpression extends Expression with trees.UnaryNode[Expressio
   }
 }

-// TODO Semantically we probably not need GroupExpression
-// All we need is holding the Seq[Expression], and ONLY used in doing the
-// expressions transformation correctly. Probably will be removed since it's
-// not like a real expressions.
-case class GroupExpression(children: Seq[Expression]) extends Expression {
-  self: Product =>
-  override def eval(input: Row): Any = throw new UnsupportedOperationException
-  override def nullable: Boolean = false
-  override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
-}
-
 /**
  * Expressions that require a specific `DataType` as input should implement this trait
  * so that the proper type conversions can be performed in the analyzer.

http://git-wip-us.apache.org/repos/asf/spark/blob/4e42842e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
index eff5c61..2f545bb 100644
---
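The enabling change is in how `QueryPlan.transformExpressions` walks a node's constructor arguments: the traversal now recurses through collections, so a `Seq[Seq[Expression]]` (which `Expand` can carry directly) is rewritten as well. A toy sketch of that recursion, with stand-in types rather than the Catalyst classes:

    // Stand-in for Catalyst's Expression tree.
    sealed trait Expr
    case class Lit(v: Int) extends Expr

    // Apply a rewrite rule to one expression (child recursion omitted for brevity).
    def applyRule(e: Expr, rule: PartialFunction[Expr, Expr]): Expr =
      rule.applyOrElse(e, identity[Expr])

    // The recursive walk over constructor arguments: bare expressions, Options,
    // and arbitrarily nested collections are all transformed.
    def recursiveTransform(arg: Any, rule: PartialFunction[Expr, Expr]): Any = arg match {
      case e: Expr             => applyRule(e, rule)
      case Some(e: Expr)       => Some(applyRule(e, rule))
      case seq: Traversable[_] => seq.map(recursiveTransform(_, rule)) // covers Seq[Seq[Expr]]
      case other               => other
    }

    // recursiveTransform(Seq(Seq(Lit(1)), Seq(Lit(2))), { case Lit(v) => Lit(v + 1) })
    // == Seq(Seq(Lit(2)), Seq(Lit(3)))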
svn commit: r9330 - /dev/spark/spark-1.4.0-rc4/
Author: pwendell Date: Wed Jun 10 23:16:15 2015 New Revision: 9330 Log: Adding Spark 1.4.0 RC4 Added: dev/spark/spark-1.4.0-rc4/ dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.3.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.3.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.3.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.3.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.4.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.4.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.4.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.4.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.6.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.6.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.6.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop2.6.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-without-hadoop.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-without-hadoop.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-without-hadoop.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-without-hadoop.tgz.sha dev/spark/spark-1.4.0-rc4/spark-1.4.0.tgz (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0.tgz.asc (with props) dev/spark/spark-1.4.0-rc4/spark-1.4.0.tgz.md5 dev/spark/spark-1.4.0-rc4/spark-1.4.0.tgz.sha Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz == Binary file - no diff available. Propchange: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz -- svn:mime-type = application/x-gzip Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.asc == Binary file - no diff available. Propchange: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.asc -- svn:mime-type = application/pgp-signature Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.md5 == --- dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.md5 (added) +++ dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.md5 Wed Jun 10 23:16:15 2015 @@ -0,0 +1 @@ +spark-1.4.0-bin-cdh4.tgz: 25 98 11 6C 28 0B 27 87 FE B1 47 CB C9 77 91 90 Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.sha == --- dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.sha (added) +++ dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-cdh4.tgz.sha Wed Jun 10 23:16:15 2015 @@ -0,0 +1,3 @@ +spark-1.4.0-bin-cdh4.tgz: 94C53911 6FE567AE 79408A91 8637FD9C 0909A737 E7C8C359 + 89541574 993CD003 691F94EF 104B6530 B2BCC23C 536F088A + 6B3729A8 C5C97DA4 1FED6041 DF9D9918 Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz == Binary file - no diff available. 
Propchange: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz -- svn:mime-type = application/x-gzip Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz.asc == Binary file - no diff available. Propchange: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz.asc -- svn:mime-type = application/pgp-signature Added: dev/spark/spark-1.4.0-rc4/spark-1.4.0-bin-hadoop1-scala2.11.tgz.md5 == ---
spark git commit: [SPARK-8189] [SQL] use Long for TimestampType in SQL
Repository: spark Updated Branches: refs/heads/master b928f5438 -> 37719e0cd

[SPARK-8189] [SQL] use Long for TimestampType in SQL

This PR changes the internal type for TimestampType to Long for efficiency, which means it will lose precision below 100ns.

Author: Davies Liu dav...@databricks.com

Closes #6733 from davies/timestamp and squashes the following commits:

d9565fa [Davies Liu] remove print
65cf2f1 [Davies Liu] fix Timestamp in SparkR
86fecfb [Davies Liu] disable two timestamp tests
8f77ee0 [Davies Liu] fix scala style
246ee74 [Davies Liu] address comments
309d2e1 [Davies Liu] use Long for TimestampType in SQL

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/37719e0c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/37719e0c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/37719e0c

Branch: refs/heads/master
Commit: 37719e0cd0b00cc5ffee0ebe1652d465a574db0f
Parents: b928f54
Author: Davies Liu dav...@databricks.com
Authored: Wed Jun 10 16:55:39 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 16:55:39 2015 -0700

--
 .../scala/org/apache/spark/api/r/SerDe.scala    | 17 --
 python/pyspark/sql/types.py                     | 11 
 .../scala/org/apache/spark/sql/BaseRow.java     | 6 ++
 .../main/scala/org/apache/spark/sql/Row.scala   | 8 ++-
 .../sql/catalyst/CatalystTypeConverters.scala   | 13 +++-
 .../spark/sql/catalyst/expressions/Cast.scala   | 62 +---
 .../expressions/SpecificMutableRow.scala        | 1 +
 .../expressions/codegen/CodeGenerator.scala     | 4 +-
 .../codegen/GenerateProjection.scala            | 10 +++-
 .../sql/catalyst/expressions/literals.scala     | 15 +++--
 .../sql/catalyst/expressions/predicates.scala   | 6 +-
 .../spark/sql/catalyst/util/DateUtils.scala     | 44 +++---
 .../apache/spark/sql/types/TimestampType.scala  | 10 +---
 .../sql/catalyst/expressions/CastSuite.scala    | 11 ++--
 .../sql/catalyst/util/DateUtilsSuite.scala      | 40 +
 .../apache/spark/sql/types/DataTypeSuite.scala  | 2 +-
 .../apache/spark/sql/columnar/ColumnStats.scala | 21 +--
 .../apache/spark/sql/columnar/ColumnType.scala  | 19 +++---
 .../sql/execution/SparkSqlSerializer2.scala     | 17 ++
 .../spark/sql/execution/debug/package.scala     | 2 +
 .../apache/spark/sql/execution/pythonUdfs.scala | 7 ++-
 .../org/apache/spark/sql/jdbc/JDBCRDD.scala     | 10 +++-
 .../apache/spark/sql/json/JacksonParser.scala   | 5 +-
 .../org/apache/spark/sql/json/JsonRDD.scala     | 10 ++--
 .../spark/sql/parquet/ParquetConverter.scala    | 9 +--
 .../spark/sql/parquet/ParquetTableSupport.scala | 10 ++--
 .../org/apache/spark/sql/CachedTableSuite.scala | 2 +-
 .../spark/sql/columnar/ColumnStatsSuite.scala   | 2 +-
 .../spark/sql/columnar/ColumnTypeSuite.scala    | 11 ++--
 .../spark/sql/columnar/ColumnarTestUtils.scala  | 9 +--
 .../org/apache/spark/sql/jdbc/JDBCSuite.scala   | 2 +-
 .../org/apache/spark/sql/json/JsonSuite.scala   | 14 +++--
 .../hive/execution/HiveCompatibilitySuite.scala | 8 ++-
 .../apache/spark/sql/hive/HiveInspectors.scala  | 20 ---
 .../org/apache/spark/sql/hive/TableReader.scala | 4 +-
 ...p cast #5-0-dbd7bcd167d322d6617b884c02c7f247 | 2 +-
 36 files changed, 272 insertions(+), 172 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/37719e0c/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
index f8e3f1a..56adc85 100644
--- a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
@@ -18,7 +18,7 @@ package org.apache.spark.api.r

 import java.io.{DataInputStream, DataOutputStream}
-import java.sql.{Date, Time}
+import java.sql.{Timestamp, Date, Time}

 import scala.collection.JavaConversions._

@@ -107,9 +107,12 @@ private[spark] object SerDe {
     Date.valueOf(readString(in))
   }

-  def readTime(in: DataInputStream): Time = {
-    val t = in.readDouble()
-    new Time((t * 1000L).toLong)
+  def readTime(in: DataInputStream): Timestamp = {
+    val seconds = in.readDouble()
+    val sec = Math.floor(seconds).toLong
+    val t = new Timestamp(sec * 1000L)
+    t.setNanos(((seconds - sec) * 1e9).toInt)
+    t
   }

   def readBytesArr(in: DataInputStream): Array[Array[Byte]] = {
@@ -227,6 +230,9 @@ private[spark] object SerDe {
         case "java.sql.Time" =>
           writeType(dos, "time")
           writeTime(dos, value.asInstanceOf[Time])
+        case "java.sql.Timestamp" =>
+          writeType(dos, "time")
+          writeTime(dos, value.asInstanceOf[Timestamp])
         case "[B" =>
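The 100ns figure follows directly from the representation: a `java.sql.Timestamp` carries nanoseconds, but a single Long spanning the whole timestamp range only affords ticks of 100ns. A sketch of the round trip at the boundary (illustrative names, ignoring pre-epoch corner cases):

    import java.sql.Timestamp

    // Encode as 100ns ticks since the epoch.
    def toTicks(t: Timestamp): Long =
      t.getTime / 1000 * 10000000L + t.getNanos / 100

    // Decode: whole seconds back into millis, the remainder back into nanos.
    def fromTicks(ticks: Long): Timestamp = {
      val ts = new Timestamp(ticks / 10000000L * 1000)
      ts.setNanos((ticks % 10000000L).toInt * 100)
      ts
    }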
spark git commit: [HOTFIX] Fixing errors in name mappings
Repository: spark Updated Branches: refs/heads/master a777eb04b - e84545fa7 [HOTFIX] Fixing errors in name mappings Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e84545fa Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e84545fa Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e84545fa Branch: refs/heads/master Commit: e84545fa771dde90de5675a9c551fe287af6f7fb Parents: a777eb0 Author: Patrick Wendell patr...@databricks.com Authored: Wed Jun 10 22:56:36 2015 -0700 Committer: Patrick Wendell patr...@databricks.com Committed: Wed Jun 10 22:56:36 2015 -0700 -- dev/create-release/known_translations | 4 1 file changed, 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/e84545fa/dev/create-release/known_translations -- diff --git a/dev/create-release/known_translations b/dev/create-release/known_translations index bbd4330..5f2671a 100644 --- a/dev/create-release/known_translations +++ b/dev/create-release/known_translations @@ -97,8 +97,6 @@ FavioVazquez - Favio Vazquez JaysonSunshine - Jayson Sunshine Liuchang0812 - Liu Chang Sephiroth-Lin - Sephiroth Lin -baishuo - Cheng Lian -daisukebe - Shixiong Zhu dobashim - Masaru Dobashi ehnalis - Zoltan Zvara emres - Emre Sevinc @@ -122,11 +120,9 @@ nyaapa - Arsenii Krasikov phatak-dev - Madhukara Phatak prabeesh - Prabeesh K rakeshchalasani - Rakesh Chalasani -raschild - Marcelo Vanzin rekhajoshm - Rekha Joshi sisihj - June He szheng79 - Shuai Zheng -ted-yu - Andrew Or texasmichelle - Michelle Casbon vinodkc - Vinod KC yongtang - Yong Tang - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [SPARK-8248][SQL] string function: length
Repository: spark Updated Branches: refs/heads/master 4e42842e8 -> 9fe3adcce

[SPARK-8248][SQL] string function: length

Author: Cheng Hao hao.ch...@intel.com

Closes #6724 from chenghao-intel/length and squashes the following commits:

aaa3c31 [Cheng Hao] revert the additional change
97148a9 [Cheng Hao] remove the codegen testing temporally
ae08003 [Cheng Hao] update the comments
1eb1fd1 [Cheng Hao] simplify the code as commented
3e92d32 [Cheng Hao] use the selectExpr in unit test intead of SQLQuery
3c729aa [Cheng Hao] fix bug for constant null value in codegen
3641f06 [Cheng Hao] keep the length() method for registered function
8e30171 [Cheng Hao] update the code as comment
db604ae [Cheng Hao] Add code gen support
548d2ef [Cheng Hao] register the length()
09a0738 [Cheng Hao] add length support

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9fe3adcc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9fe3adcc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9fe3adcc

Branch: refs/heads/master
Commit: 9fe3adccef687c92ff1ac17d946af089c8e28d66
Parents: 4e42842
Author: Cheng Hao hao.ch...@intel.com
Authored: Wed Jun 10 19:55:10 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 19:55:10 2015 -0700

--
 .../catalyst/analysis/FunctionRegistry.scala    | 13 +++-
 .../sql/catalyst/expressions/Expression.scala   | 3 +++
 .../catalyst/expressions/stringOperations.scala | 21 
 .../expressions/StringFunctionsSuite.scala      | 12 +++
 .../scala/org/apache/spark/sql/functions.scala  | 18 +
 .../spark/sql/DataFrameFunctionsSuite.scala     | 20 +++
 6 files changed, 82 insertions(+), 5 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/9fe3adcc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index ba89a5c..39875d7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -89,14 +89,10 @@ object FunctionRegistry {
     expression[CreateArray]("array"),
     expression[Coalesce]("coalesce"),
     expression[Explode]("explode"),
-    expression[Lower]("lower"),
-    expression[Substring]("substr"),
-    expression[Substring]("substring"),
     expression[Rand]("rand"),
     expression[Randn]("randn"),
     expression[CreateStruct]("struct"),
     expression[Sqrt]("sqrt"),
-    expression[Upper]("upper"),

     // Math functions
     expression[Acos]("acos"),
@@ -132,7 +128,14 @@ object FunctionRegistry {
     expression[Last]("last"),
     expression[Max]("max"),
     expression[Min]("min"),
-    expression[Sum]("sum")
+    expression[Sum]("sum"),
+
+    // string functions
+    expression[Lower]("lower"),
+    expression[StringLength]("length"),
+    expression[Substring]("substr"),
+    expression[Substring]("substring"),
+    expression[Upper]("upper")
   )

   val builtin: FunctionRegistry = {

http://git-wip-us.apache.org/repos/asf/spark/blob/9fe3adcc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
index 63dd5f9..8c1e4d7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala
@@ -212,6 +212,9 @@ abstract class LeafExpression extends Expression with trees.LeafNode[Expression]
 abstract class UnaryExpression extends Expression with trees.UnaryNode[Expression] {
   self: Product =>

+  override def foldable: Boolean = child.foldable
+  override def nullable: Boolean = child.nullable
+
   /**
    * Called by unary expressions to generate a code block that returns null if its parent returns
    * null, and if not not null, use `f` to generate the expression.

http://git-wip-us.apache.org/repos/asf/spark/blob/9fe3adcc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
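Because lookup now goes through FunctionRegistry, `length` is usable anywhere expressions are parsed, with no Hive-side special case. For instance (assuming a `sqlContext` and its implicits, in the style of the test suites):

    import sqlContext.implicits._

    val df = Seq(("abc", 1), ("defgh", 2)).toDF("name", "n")
    df.registerTempTable("t")

    df.selectExpr("length(name)").show()                // 3 and 5
    sqlContext.sql("SELECT length(name) FROM t").show() // same StringLength expression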
spark git commit: [SPARK-8217] [SQL] math function log2
Repository: spark Updated Branches: refs/heads/master 9fe3adcce -> 2758ff0a9

[SPARK-8217] [SQL] math function log2

Author: Daoyuan Wang daoyuan.w...@intel.com

This patch had conflicts when merged, resolved by
Committer: Reynold Xin r...@databricks.com

Closes #6718 from adrian-wang/udflog2 and squashes the following commits:

3909f48 [Daoyuan Wang] math function: log2

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2758ff0a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2758ff0a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2758ff0a

Branch: refs/heads/master
Commit: 2758ff0a96f03a61e10999b2462acf7a13236b7c
Parents: 9fe3adc
Author: Daoyuan Wang daoyuan.w...@intel.com
Authored: Wed Jun 10 20:22:32 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Wed Jun 10 20:22:32 2015 -0700

--
 .../catalyst/analysis/FunctionRegistry.scala    | 1 +
 .../spark/sql/catalyst/expressions/math.scala   | 17 +
 .../expressions/MathFunctionsSuite.scala        | 6 ++
 .../scala/org/apache/spark/sql/functions.scala  | 20 ++--
 .../spark/sql/DataFrameFunctionsSuite.scala     | 12 
 5 files changed, 54 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/2758ff0a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index 39875d7..a7816e3 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -111,6 +111,7 @@ object FunctionRegistry {
     expression[Log10]("log10"),
     expression[Log1p]("log1p"),
     expression[Pi]("pi"),
+    expression[Log2]("log2"),
     expression[Pow]("pow"),
     expression[Rint]("rint"),
     expression[Signum]("signum"),

http://git-wip-us.apache.org/repos/asf/spark/blob/2758ff0a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
index e1d8c9a..97e960b 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
@@ -161,6 +161,23 @@ case class Floor(child: Expression) extends UnaryMathExpression(math.floor, "FLO

 case class Log(child: Expression) extends UnaryMathExpression(math.log, "LOG")

+case class Log2(child: Expression)
+  extends UnaryMathExpression((x: Double) => math.log(x) / math.log(2), "LOG2") {
+  override def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
+    val eval = child.gen(ctx)
+    eval.code + s"""
+      boolean ${ev.isNull} = ${eval.isNull};
+      ${ctx.javaType(dataType)} ${ev.primitive} = ${ctx.defaultValue(dataType)};
+      if (!${ev.isNull}) {
+        ${ev.primitive} = java.lang.Math.log(${eval.primitive}) / java.lang.Math.log(2);
+        if (Double.valueOf(${ev.primitive}).isNaN()) {
+          ${ev.isNull} = true;
+        }
+      }
+    """
+  }
+}
+
 case class Log10(child: Expression) extends UnaryMathExpression(math.log10, "LOG10")

 case class Log1p(child: Expression) extends UnaryMathExpression(math.log1p, "LOG1P")
http://git-wip-us.apache.org/repos/asf/spark/blob/2758ff0a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala
--
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala
index 1fe6905..864c954 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala
@@ -185,6 +185,12 @@ class MathFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     testUnary(Log1p, math.log1p, (-10 to -2).map(_ * 1.0), expectNull = true)
   }

+  test("log2") {
+    def f: (Double) => Double = (x: Double) => math.log(x) / math.log(2)
+    testUnary(Log2, f, (0 to 20).map(_ * 0.1))
+    testUnary(Log2, f, (-5 to -1).map(_ * 1.0), expectNull = true)
+  }
+
   test("pow") {
     testBinary(Pow,
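Since the JVM has no `Math.log2`, both the interpreted and generated paths rely on the change-of-base identity log2(x) = ln(x) / ln(2), turning NaN results into null. A quick standalone check of those semantics (plain Scala, mirroring the behavior rather than the Catalyst plumbing):

    def log2(x: Double): Option[Double] = {
      val v = math.log(x) / math.log(2)
      if (v.isNaN) None else Some(v) // math.log of a negative input is NaN
    }

    assert(math.abs(log2(8.0).get - 3.0) < 1e-12)
    assert(log2(-1.0).isEmpty)
    // Note log2(0.0) is Some(-Infinity), not None: only NaN maps to null,
    // matching the generated code above.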
spark git commit: [HOTFIX] Adding more contributor name bindings
Repository: spark Updated Branches: refs/heads/master 2758ff0a9 - a777eb04b [HOTFIX] Adding more contributor name bindings Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a777eb04 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a777eb04 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a777eb04 Branch: refs/heads/master Commit: a777eb04bf981312b640326607158f78dd4163cd Parents: 2758ff0 Author: Patrick Wendell patr...@databricks.com Authored: Wed Jun 10 21:13:47 2015 -0700 Committer: Patrick Wendell patr...@databricks.com Committed: Wed Jun 10 21:14:03 2015 -0700 -- dev/create-release/known_translations | 42 ++ 1 file changed, 42 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a777eb04/dev/create-release/known_translations -- diff --git a/dev/create-release/known_translations b/dev/create-release/known_translations index 0a599b5..bbd4330 100644 --- a/dev/create-release/known_translations +++ b/dev/create-release/known_translations @@ -91,3 +91,45 @@ zapletal-martin - Martin Zapletal zuxqoj - Shekhar Bansal mingyukim - Mingyu Kim sigmoidanalytics - Mayur Rustagi +AiHe - Ai He +BenFradet - Ben Fradet +FavioVazquez - Favio Vazquez +JaysonSunshine - Jayson Sunshine +Liuchang0812 - Liu Chang +Sephiroth-Lin - Sephiroth Lin +baishuo - Cheng Lian +daisukebe - Shixiong Zhu +dobashim - Masaru Dobashi +ehnalis - Zoltan Zvara +emres - Emre Sevinc +gchen - Guancheng Chen +haiyangsea - Haiyang Sea +hlin09 - Hao Lin +hqzizania - Qian Huang +jeanlyn - Jean Lyn +jerluc - Jeremy A. Lucas +jrabary - Jaonary Rabarisoa +judynash - Judy Nash +kaka1992 - Chen Song +ksonj - Kalle Jepsen +kuromatsu-nobuyuki - Nobuyuki Kuromatsu +lazyman500 - Dong Xu +leahmcguire - Leah McGuire +mbittmann - Mark Bittmann +mbonaci - Marko Bonaci +meawoppl - Matthew Goodman +nyaapa - Arsenii Krasikov +phatak-dev - Madhukara Phatak +prabeesh - Prabeesh K +rakeshchalasani - Rakesh Chalasani +raschild - Marcelo Vanzin +rekhajoshm - Rekha Joshi +sisihj - June He +szheng79 - Shuai Zheng +ted-yu - Andrew Or +texasmichelle - Michelle Casbon +vinodkc - Vinod KC +yongtang - Yong Tang +ypcat - Pei-Lun Lee +zhichao-li - Zhichao Li +zzcclp - Zhichao Zhang - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.4.0-rc4 [deleted] 22596c534 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.4.0 [created] 22596c534 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.4.0-rc2 [deleted] 03fb26a3e - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.4.0-rc1 [deleted] 777a08166 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.4.0-rc3 [deleted] dd109a874 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org