spark git commit: [SPARK-7137] [ML] Update SchemaUtils checkInputColumn to print more info if needed
Repository: spark
Updated Branches:
  refs/heads/master 2b820f2a4 -> f9c448dce

[SPARK-7137] [ML] Update SchemaUtils checkInputColumn to print more info if needed

Author: Joshi <rekhajo...@gmail.com>
Author: Rekha Joshi <rekhajo...@gmail.com>

Closes #5992 from rekhajoshm/fix/SPARK-7137 and squashes the following commits:

8c42b57 [Joshi] update checkInputColumn to print more info if needed
33ddd2e [Joshi] update checkInputColumn to print more info if needed
acf3e17 [Joshi] update checkInputColumn to print more info if needed
8993c0e [Joshi] SPARK-7137: Add checkInputColumn back to Params and print more info
e3677c9 [Rekha Joshi] Merge pull request #1 from apache/master

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f9c448dc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f9c448dc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f9c448dc

Branch: refs/heads/master
Commit: f9c448dce8139e85ac564daa0f7e0325e778cffe
Parents: 2b820f2
Author: Joshi <rekhajo...@gmail.com>
Authored: Sun Jul 5 12:58:03 2015 -0700
Committer: Joseph K. Bradley <jos...@databricks.com>
Committed: Sun Jul 5 12:58:03 2015 -0700

--
 .../main/scala/org/apache/spark/ml/util/SchemaUtils.scala | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/f9c448dc/mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala b/mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala
index 7cd53c6..76f6514 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala
@@ -32,10 +32,15 @@ private[spark] object SchemaUtils {
    * @param colName  column name
    * @param dataType  required column data type
    */
-  def checkColumnType(schema: StructType, colName: String, dataType: DataType): Unit = {
+  def checkColumnType(
+      schema: StructType,
+      colName: String,
+      dataType: DataType,
+      msg: String = ""): Unit = {
     val actualDataType = schema(colName).dataType
+    val message = if (msg != null && msg.trim.length > 0) " " + msg else ""
     require(actualDataType.equals(dataType),
-      s"Column $colName must be of type $dataType but was actually $actualDataType.")
+      s"Column $colName must be of type $dataType but was actually $actualDataType.$message")
   }

   /**

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
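The patch above boils down to one small pattern: an optional message argument that is appended to the `require` failure text only when it is non-null and non-blank. Below is a dependency-free sketch of that pattern; the schema lookup is replaced by a plain string stand-in, so the names here are illustrative, not Spark's actual API.

```scala
object CheckColumnTypeSketch {
  // Mirrors the patched SchemaUtils.checkColumnType: `msg` defaults to ""
  // and is appended to the error message only when non-null and non-blank.
  // `actualDataType` stands in for schema(colName).dataType.
  def checkColumnType(
      actualDataType: String,
      colName: String,
      dataType: String,
      msg: String = ""): Unit = {
    val message = if (msg != null && msg.trim.length > 0) " " + msg else ""
    require(actualDataType == dataType,
      s"Column $colName must be of type $dataType but was actually $actualDataType.$message")
  }

  def main(args: Array[String]): Unit = {
    // Types match: returns silently.
    checkColumnType("DoubleType", "features", "DoubleType")
    // Types differ: the caller-supplied hint is appended to the standard text.
    try {
      checkColumnType("StringType", "features", "DoubleType",
        "Hint: this column may need to be cast first.")
    } catch {
      case e: IllegalArgumentException => println(e.getMessage)
    }
  }
}
```

The default argument keeps every existing call site source-compatible, which is why the patch can change the signature without touching callers.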
svn commit: r1689292 - /spark/js/downloads.js
Author: lian
Date: Sun Jul 5 21:36:13 2015
New Revision: 1689292

URL: http://svn.apache.org/r1689292
Log: Bumps old version threshold to 1.2.0

Modified: spark/js/downloads.js

Modified: spark/js/downloads.js
URL: http://svn.apache.org/viewvc/spark/js/downloads.js?rev=1689292&r1=1689291&r2=1689292&view=diff
==
--- spark/js/downloads.js (original)
+++ spark/js/downloads.js Sun Jul 5 21:36:13 2015
@@ -171,7 +171,7 @@ function updateDownloadLink() {
   if (pkg.toLowerCase().indexOf("mapr") > -1) {
     link = "http://package.mapr.com/tools/apache-spark/$ver/$artifact";
   } else if (download == "apache") {
-    if (version <= "1.0.0") {
+    if (version <= "1.2.0") {
       link = "http://archive.apache.org/dist/spark/spark-$ver/$artifact";
     } else {
       link = "http://www.apache.org/dyn/closer.cgi/spark/spark-$ver/$artifact";
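The hunk above selects between the permanent archive and the mirror-resolution CGI based on a version threshold. A minimal sketch of that selection logic (this is an assumption-laden reconstruction, not the site's actual code; the `downloadLink` name and the threshold constant are hypothetical). Note that the comparison in the JavaScript is a plain lexicographic string comparison, which only behaves like a numeric version comparison while every version component stays single-digit:

```scala
object DownloadLinkSketch {
  // Releases at or below the "old version" threshold are served from the
  // permanent archive; newer releases resolve through a mirror via closer.cgi.
  // The $ver/$artifact placeholders mirror the template style of downloads.js.
  val oldVersionThreshold = "1.2.0"

  def downloadLink(version: String): String =
    if (version <= oldVersionThreshold)  // lexicographic, like JS string <=
      "http://archive.apache.org/dist/spark/spark-$ver/$artifact"
    else
      "http://www.apache.org/dyn/closer.cgi/spark/spark-$ver/$artifact"
}
```

Bumping the threshold from 1.0.0 to 1.2.0 moves the 1.0.x and 1.1.x lines off the mirrors and onto archive.apache.org without changing any other logic.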
svn commit: r1689293 - /spark/site/js/downloads.js
Author: lian
Date: Sun Jul 5 21:40:14 2015
New Revision: 1689293

URL: http://svn.apache.org/r1689293
Log: Checks in updated site/ directory

Modified: spark/site/js/downloads.js

Modified: spark/site/js/downloads.js
URL: http://svn.apache.org/viewvc/spark/site/js/downloads.js?rev=1689293&r1=1689292&r2=1689293&view=diff
==
--- spark/site/js/downloads.js (original)
+++ spark/site/js/downloads.js Sun Jul 5 21:40:14 2015
@@ -171,7 +171,7 @@ function updateDownloadLink() {
   if (pkg.toLowerCase().indexOf("mapr") > -1) {
     link = "http://package.mapr.com/tools/apache-spark/$ver/$artifact";
   } else if (download == "apache") {
-    if (version <= "1.0.0") {
+    if (version <= "1.2.0") {
       link = "http://archive.apache.org/dist/spark/spark-$ver/$artifact";
     } else {
       link = "http://www.apache.org/dyn/closer.cgi/spark/spark-$ver/$artifact";
spark git commit: [SQL][Minor] Update the DataFrame API for encode/decode
Repository: spark
Updated Branches:
  refs/heads/master a0cb111b2 -> 6d0411b4f

[SQL][Minor] Update the DataFrame API for encode/decode

This is a follow-up of #6843.

Author: Cheng Hao <hao.ch...@intel.com>

Closes #7230 from chenghao-intel/str_funcs2_followup and squashes the following commits:

52cc553 [Cheng Hao] update the code as comment

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6d0411b4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6d0411b4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6d0411b4

Branch: refs/heads/master
Commit: 6d0411b4f3a202cfb53f638ee5fd49072b42d3a6
Parents: a0cb111
Author: Cheng Hao <hao.ch...@intel.com>
Authored: Sun Jul 5 21:50:52 2015 -0700
Committer: Reynold Xin <r...@databricks.com>
Committed: Sun Jul 5 21:50:52 2015 -0700

--
 .../catalyst/expressions/stringOperations.scala | 21 ++--
 .../scala/org/apache/spark/sql/functions.scala  | 14 +++--
 .../spark/sql/DataFrameFunctionsSuite.scala     |  8 ++--
 3 files changed, 25 insertions(+), 18 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/6d0411b4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
index 6de4062..1a14a7a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
@@ -392,12 +392,13 @@ case class UnBase64(child: Expression) extends UnaryExpression with ExpectsInput
 /**
  * Decodes the first argument into a String using the provided character set
  * (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
- * If either argument is null, the result will also be null. (As of Hive 0.12.0.).
+ * If either argument is null, the result will also be null.
  */
-case class Decode(bin: Expression, charset: Expression) extends Expression with ExpectsInputTypes {
-  override def children: Seq[Expression] = bin :: charset :: Nil
-  override def foldable: Boolean = bin.foldable && charset.foldable
-  override def nullable: Boolean = bin.nullable || charset.nullable
+case class Decode(bin: Expression, charset: Expression)
+  extends BinaryExpression with ExpectsInputTypes {
+
+  override def left: Expression = bin
+  override def right: Expression = charset
   override def dataType: DataType = StringType
   override def inputTypes: Seq[DataType] = Seq(BinaryType, StringType)

@@ -420,13 +421,13 @@ case class Decode(bin: Expression, charset: Expression) extends Expression with
 /**
  * Encodes the first argument into a BINARY using the provided character set
  * (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
- * If either argument is null, the result will also be null. (As of Hive 0.12.0.)
+ * If either argument is null, the result will also be null.
  */
 case class Encode(value: Expression, charset: Expression)
-  extends Expression with ExpectsInputTypes {
-  override def children: Seq[Expression] = value :: charset :: Nil
-  override def foldable: Boolean = value.foldable && charset.foldable
-  override def nullable: Boolean = value.nullable || charset.nullable
+  extends BinaryExpression with ExpectsInputTypes {
+
+  override def left: Expression = value
+  override def right: Expression = charset
   override def dataType: DataType = BinaryType
   override def inputTypes: Seq[DataType] = Seq(StringType, StringType)

http://git-wip-us.apache.org/repos/asf/spark/blob/6d0411b4/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index abcfc0b..f802917 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -1666,18 +1666,19 @@ object functions {
    * @group string_funcs
    * @since 1.5.0
    */
-  def encode(value: Column, charset: Column): Column = Encode(value.expr, charset.expr)
+  def encode(value: Column, charset: String): Column = Encode(value.expr, lit(charset).expr)

   /**
    * Computes the first argument into a binary from a string using the provided character set
    * (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16').
    * If either argument is null, the result will also be null.
+   *
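The refactoring in this commit is a standard pull-up: moving `Decode` and `Encode` from `Expression` to `BinaryExpression` lets the base class derive `children`, `foldable`, and `nullable` from `left`/`right`, so each concrete expression only declares which operand is which. A self-contained sketch with simplified stand-in traits (these are not Spark's real classes, just the shape of the idea):

```scala
trait Expr {
  def children: Seq[Expr]
  def foldable: Boolean
  def nullable: Boolean
}

// The boilerplate formerly repeated in Decode and Encode lives here once:
// a binary expression is foldable iff both operands are, and nullable if
// either operand is.
trait BinaryExpr extends Expr {
  def left: Expr
  def right: Expr
  override def children: Seq[Expr] = left :: right :: Nil
  override def foldable: Boolean = left.foldable && right.foldable
  override def nullable: Boolean = left.nullable || right.nullable
}

// A leaf with fixed properties, standing in for a real operand expression.
case class Leaf(foldable: Boolean, nullable: Boolean) extends Expr {
  override def children: Seq[Expr] = Nil
}

// After the patch, Decode only names its operands; everything else is derived.
case class Decode(bin: Expr, charset: Expr) extends BinaryExpr {
  override def left: Expr = bin
  override def right: Expr = charset
}
```

Besides removing three overridden members per class, inheriting from `BinaryExpression` also gives these expressions any shared machinery the base class adds later, without further edits to each subclass.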
spark git commit: [SPARK-8549] [SPARKR] Fix the line length of SparkR
Repository: spark
Updated Branches:
  refs/heads/master f9c448dce -> a0cb111b2

[SPARK-8549] [SPARKR] Fix the line length of SparkR

[[SPARK-8549] Fix the line length of SparkR - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-8549)

Author: Yu ISHIKAWA <yuu.ishik...@gmail.com>

Closes #7204 from yu-iskw/SPARK-8549 and squashes the following commits:

6fb131a [Yu ISHIKAWA] Fix the typo
1737598 [Yu ISHIKAWA] [SPARK-8549][SparkR] Fix the line length of SparkR

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a0cb111b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a0cb111b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a0cb111b

Branch: refs/heads/master
Commit: a0cb111b22cb093e86b0daeecb3dcc41d095df40
Parents: f9c448d
Author: Yu ISHIKAWA <yuu.ishik...@gmail.com>
Authored: Sun Jul 5 20:50:02 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Sun Jul 5 20:50:02 2015 -0700

--
 R/pkg/R/generics.R                 |  3 ++-
 R/pkg/R/pairRDD.R                  | 12 ++--
 R/pkg/R/sparkR.R                   |  9 ++---
 R/pkg/R/utils.R                    | 31 ++-
 R/pkg/inst/tests/test_includeJAR.R |  4 ++--
 R/pkg/inst/tests/test_rdd.R        | 12 
 R/pkg/inst/tests/test_sparkSQL.R   | 11 +--
 7 files changed, 51 insertions(+), 31 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/a0cb111b/R/pkg/R/generics.R
--
diff --git a/R/pkg/R/generics.R b/R/pkg/R/generics.R
index 79055b7..fad9d71 100644
--- a/R/pkg/R/generics.R
+++ b/R/pkg/R/generics.R
@@ -20,7 +20,8 @@
 # @rdname aggregateRDD
 # @seealso reduce
 # @export
-setGeneric("aggregateRDD", function(x, zeroValue, seqOp, combOp) { standardGeneric("aggregateRDD") })
+setGeneric("aggregateRDD",
+           function(x, zeroValue, seqOp, combOp) { standardGeneric("aggregateRDD") })

 # @rdname cache-methods
 # @export

http://git-wip-us.apache.org/repos/asf/spark/blob/a0cb111b/R/pkg/R/pairRDD.R
--
diff --git a/R/pkg/R/pairRDD.R b/R/pkg/R/pairRDD.R
index 7f902ba..0f1179e 100644
--- a/R/pkg/R/pairRDD.R
+++ b/R/pkg/R/pairRDD.R
@@ -560,8 +560,8 @@ setMethod("join",
 # Left outer join two RDDs
 #
 # @description
-# \code{leftouterjoin} This function left-outer-joins two RDDs where every element is of the form list(K, V).
-# The key types of the two RDDs should be the same.
+# \code{leftouterjoin} This function left-outer-joins two RDDs where every element is of
+# the form list(K, V). The key types of the two RDDs should be the same.
 #
 # @param x An RDD to be joined. Should be an RDD where each element is
 #          list(K, V).
@@ -597,8 +597,8 @@ setMethod("leftOuterJoin",
 # Right outer join two RDDs
 #
 # @description
-# \code{rightouterjoin} This function right-outer-joins two RDDs where every element is of the form list(K, V).
-# The key types of the two RDDs should be the same.
+# \code{rightouterjoin} This function right-outer-joins two RDDs where every element is of
+# the form list(K, V). The key types of the two RDDs should be the same.
 #
 # @param x An RDD to be joined. Should be an RDD where each element is
 #          list(K, V).
@@ -634,8 +634,8 @@ setMethod("rightOuterJoin",
 # Full outer join two RDDs
 #
 # @description
-# \code{fullouterjoin} This function full-outer-joins two RDDs where every element is of the form list(K, V).
-# The key types of the two RDDs should be the same.
+# \code{fullouterjoin} This function full-outer-joins two RDDs where every element is of
+# the form list(K, V). The key types of the two RDDs should be the same.
 #
 # @param x An RDD to be joined. Should be an RDD where each element is
 #          list(K, V).

http://git-wip-us.apache.org/repos/asf/spark/blob/a0cb111b/R/pkg/R/sparkR.R
--
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index 86233e0..048eb8e 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -105,7 +105,8 @@ sparkR.init <- function(
   sparkPackages = "") {

   if (exists(".sparkRjsc", envir = .sparkREnv)) {
-    cat("Re-using existing Spark Context. Please stop SparkR with sparkR.stop() or restart R to create a new Spark Context\n")
+    cat(paste("Re-using existing Spark Context.",
+              "Please stop SparkR with sparkR.stop() or restart R to create a new Spark Context\n"))
     return(get(".sparkRjsc", envir = .sparkREnv))
   }

@@ -180,14 +181,16 @@ sparkR.init <- function(
   sparkExecutorEnvMap <- new.env()
   if (!any(names(sparkExecutorEnv) == "LD_LIBRARY_PATH")) {
-    sparkExecutorEnvMap[["LD_LIBRARY_PATH"]] <- paste0("$LD_LIBRARY_PATH:", Sys.getenv("LD_LIBRARY_PATH"))
+