spark git commit: [MINOR] [SQL] Fix sphinx warnings in PySpark SQL
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 5be517584 -> 257e9d727


[MINOR] [SQL] Fix sphinx warnings in PySpark SQL

Author: MechCoder <manojkumarsivaraj...@gmail.com>

Closes #8171 from MechCoder/sql_sphinx.

(cherry picked from commit 52c60537a274af5414f6b0340a4bd7488ef35280)
Signed-off-by: Xiangrui Meng <m...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/257e9d72
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/257e9d72
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/257e9d72

Branch: refs/heads/branch-1.5
Commit: 257e9d727874332fd192f6a993f9ea8bf464abf5
Parents: 5be5175
Author: MechCoder <manojkumarsivaraj...@gmail.com>
Authored: Thu Aug 20 10:05:31 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Aug 20 10:05:39 2015 -0700

----------------------------------------------------------------------
 python/pyspark/context.py   | 8 ++++----
 python/pyspark/sql/types.py | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/257e9d72/python/pyspark/context.py
----------------------------------------------------------------------
diff --git a/python/pyspark/context.py b/python/pyspark/context.py
index eb5b0bb..1b2a52a 100644
--- a/python/pyspark/context.py
+++ b/python/pyspark/context.py
@@ -302,10 +302,10 @@ class SparkContext(object):
         A unique identifier for the Spark application.
         Its format depends on the scheduler implementation.
-        (i.e.
-            in case of local spark app something like 'local-1433865536131'
-            in case of YARN something like 'application_1433865536131_34483'
-        )
+
+        * in case of local spark app something like 'local-1433865536131'
+        * in case of YARN something like 'application_1433865536131_34483'
+
         >>> sc.applicationId  # doctest: +ELLIPSIS
         u'local-...'

http://git-wip-us.apache.org/repos/asf/spark/blob/257e9d72/python/pyspark/sql/types.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
index c083bf8..ed4e5b5 100644
--- a/python/pyspark/sql/types.py
+++ b/python/pyspark/sql/types.py
@@ -467,9 +467,11 @@ class StructType(DataType):
         Construct a StructType by adding new elements to it to define the schema.
         The method accepts either:
+
         a) A single parameter which is a StructField object.
         b) Between 2 and 4 parameters as (name, data_type, nullable (optional),
-           metadata(optional). The data_type parameter may be either a String or a DataType object
+           metadata(optional). The data_type parameter may be either a String or a
+           DataType object.

         >>> struct1 = StructType().add("f1", StringType(), True).add("f2", StringType(), True, None)
         >>> struct2 = StructType([StructField("f1", StringType(), True), \
spark git commit: [MINOR] [SQL] Fix sphinx warnings in PySpark SQL
Repository: spark
Updated Branches:
  refs/heads/master b4f4e91c3 -> 52c60537a


[MINOR] [SQL] Fix sphinx warnings in PySpark SQL

Author: MechCoder <manojkumarsivaraj...@gmail.com>

Closes #8171 from MechCoder/sql_sphinx.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/52c60537
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/52c60537
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/52c60537

Branch: refs/heads/master
Commit: 52c60537a274af5414f6b0340a4bd7488ef35280
Parents: b4f4e91
Author: MechCoder <manojkumarsivaraj...@gmail.com>
Authored: Thu Aug 20 10:05:31 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Aug 20 10:05:31 2015 -0700

----------------------------------------------------------------------
 python/pyspark/context.py   | 8 ++++----
 python/pyspark/sql/types.py | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/52c60537/python/pyspark/context.py
----------------------------------------------------------------------
diff --git a/python/pyspark/context.py b/python/pyspark/context.py
index eb5b0bb..1b2a52a 100644
--- a/python/pyspark/context.py
+++ b/python/pyspark/context.py
@@ -302,10 +302,10 @@ class SparkContext(object):
         A unique identifier for the Spark application.
         Its format depends on the scheduler implementation.
-        (i.e.
-            in case of local spark app something like 'local-1433865536131'
-            in case of YARN something like 'application_1433865536131_34483'
-        )
+
+        * in case of local spark app something like 'local-1433865536131'
+        * in case of YARN something like 'application_1433865536131_34483'
+
         >>> sc.applicationId  # doctest: +ELLIPSIS
         u'local-...'

http://git-wip-us.apache.org/repos/asf/spark/blob/52c60537/python/pyspark/sql/types.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
index c083bf8..ed4e5b5 100644
--- a/python/pyspark/sql/types.py
+++ b/python/pyspark/sql/types.py
@@ -467,9 +467,11 @@ class StructType(DataType):
         Construct a StructType by adding new elements to it to define the schema.
         The method accepts either:
+
        a) A single parameter which is a StructField object.
        b) Between 2 and 4 parameters as (name, data_type, nullable (optional),
-          metadata(optional). The data_type parameter may be either a String or a DataType object
+          metadata(optional). The data_type parameter may be either a String or a
+          DataType object.

         >>> struct1 = StructType().add("f1", StringType(), True).add("f2", StringType(), True, None)
         >>> struct2 = StructType([StructField("f1", StringType(), True), \
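For comparison only, not part of the patch: the two-field schema that the corrected docstring builds with chained add() calls corresponds to the following explicit construction in Spark's Scala API; the field names "f1" and "f2" are taken from the docstring's doctest.

    import org.apache.spark.sql.types.{StringType, StructField, StructType}

    // Same schema as struct1/struct2 in the fixed docstring:
    // two nullable string fields, metadata left at its default.
    val schema = StructType(Seq(
      StructField("f1", StringType, nullable = true),
      StructField("f2", StringType, nullable = true)))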
spark git commit: [SPARK-9982] [SPARKR] SparkR DataFrame fail to return data of Decimal type
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 257e9d727 -> a7027e6d3


[SPARK-9982] [SPARKR] SparkR DataFrame fail to return data of Decimal type

Author: Alex Shkurenko <ashkure...@enova.com>

Closes #8239 from ashkurenko/master.

(cherry picked from commit 39e91fe2fd43044cc734d55625a3c03284b69f09)
Signed-off-by: Shivaram Venkataraman <shiva...@cs.berkeley.edu>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a7027e6d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a7027e6d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a7027e6d

Branch: refs/heads/branch-1.5
Commit: a7027e6d3369a1157c53557c8215273606086d84
Parents: 257e9d7
Author: Alex Shkurenko <ashkure...@enova.com>
Authored: Thu Aug 20 10:16:38 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Thu Aug 20 10:16:57 2015 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/api/r/SerDe.scala | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/a7027e6d/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
index d5b4260..3c89f24 100644
--- a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
@@ -181,6 +181,7 @@ private[spark] object SerDe {
   // Boolean -> logical
   // Float -> double
   // Double -> double
+  // Decimal -> double
   // Long -> double
   // Array[Byte] -> raw
   // Date -> Date
@@ -219,6 +220,10 @@ private[spark] object SerDe {
     case "float" | "java.lang.Float" =>
       writeType(dos, "double")
       writeDouble(dos, value.asInstanceOf[Float].toDouble)
+    case "decimal" | "java.math.BigDecimal" =>
+      writeType(dos, "double")
+      val javaDecimal = value.asInstanceOf[java.math.BigDecimal]
+      writeDouble(dos, scala.math.BigDecimal(javaDecimal).toDouble)
     case "double" | "java.lang.Double" =>
       writeType(dos, "double")
       writeDouble(dos, value.asInstanceOf[Double])
spark git commit: [SPARK-9982] [SPARKR] SparkR DataFrame fail to return data of Decimal type
Repository: spark
Updated Branches:
  refs/heads/master 52c60537a -> 39e91fe2f


[SPARK-9982] [SPARKR] SparkR DataFrame fail to return data of Decimal type

Author: Alex Shkurenko <ashkure...@enova.com>

Closes #8239 from ashkurenko/master.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/39e91fe2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/39e91fe2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/39e91fe2

Branch: refs/heads/master
Commit: 39e91fe2fd43044cc734d55625a3c03284b69f09
Parents: 52c6053
Author: Alex Shkurenko <ashkure...@enova.com>
Authored: Thu Aug 20 10:16:38 2015 -0700
Committer: Shivaram Venkataraman <shiva...@cs.berkeley.edu>
Committed: Thu Aug 20 10:16:38 2015 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/api/r/SerDe.scala | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/39e91fe2/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
index d5b4260..3c89f24 100644
--- a/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
+++ b/core/src/main/scala/org/apache/spark/api/r/SerDe.scala
@@ -181,6 +181,7 @@ private[spark] object SerDe {
   // Boolean -> logical
   // Float -> double
   // Double -> double
+  // Decimal -> double
   // Long -> double
   // Array[Byte] -> raw
   // Date -> Date
@@ -219,6 +220,10 @@ private[spark] object SerDe {
     case "float" | "java.lang.Float" =>
       writeType(dos, "double")
       writeDouble(dos, value.asInstanceOf[Float].toDouble)
+    case "decimal" | "java.math.BigDecimal" =>
+      writeType(dos, "double")
+      val javaDecimal = value.asInstanceOf[java.math.BigDecimal]
+      writeDouble(dos, scala.math.BigDecimal(javaDecimal).toDouble)
     case "double" | "java.lang.Double" =>
       writeType(dos, "double")
       writeDouble(dos, value.asInstanceOf[Double])
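The conversion the patch introduces can be sketched in isolation. In SerDe the output stream and value arrive from the R bridge; the stream setup and sample value below are illustrative assumptions only. Note the conversion is lossy for decimals wider than a double, the trade-off for mapping onto R's single numeric type.

    import java.io.{ByteArrayOutputStream, DataOutputStream}

    val dos = new DataOutputStream(new ByteArrayOutputStream())
    val javaDecimal = new java.math.BigDecimal("123.456")
    // R has no decimal type, so the value is written as a double,
    // routed through scala.math.BigDecimal exactly as in the patch.
    dos.writeDouble(scala.math.BigDecimal(javaDecimal).toDouble)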
spark git commit: [SPARK-10092] [SQL] Multi-DB support follow up.
Repository: spark
Updated Branches:
  refs/heads/master b762f9920 -> 43e013542


[SPARK-10092] [SQL] Multi-DB support follow up.

https://issues.apache.org/jira/browse/SPARK-10092

This PR is a follow-up for Multi-DB support. It has the following changes:

* `HiveContext.refreshTable` now accepts `dbName.tableName`.
* `HiveContext.analyze` now accepts `dbName.tableName`.
* `CreateTableUsing`, `CreateTableUsingAsSelect`, `CreateTempTableUsing`, `CreateTempTableUsingAsSelect`, `CreateMetastoreDataSource`, and `CreateMetastoreDataSourceAsSelect` all take `TableIdentifier` instead of the string representation of the table name.
* When you call `saveAsTable` with a specified database, the data will be saved to the correct location.
* Explicitly do not allow users to create a temporary table with a specified database name (users could not do it before either).
* When we save a table to the metastore, we also check whether the db name and table name are acceptable to Hive (using `MetaStoreUtils.validateName`).

Author: Yin Huai <yh...@databricks.com>

Closes #8324 from yhuai/saveAsTableDB.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/43e01354
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/43e01354
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/43e01354

Branch: refs/heads/master
Commit: 43e0135421b2262cbb0e06aae53523f663b4f959
Parents: b762f99
Author: Yin Huai <yh...@databricks.com>
Authored: Thu Aug 20 15:30:31 2015 +0800
Committer: Cheng Lian <l...@databricks.com>
Committed: Thu Aug 20 15:30:31 2015 +0800

----------------------------------------------------------------------
 .../spark/sql/catalyst/TableIdentifier.scala    |   4 +-
 .../spark/sql/catalyst/analysis/Catalog.scala   |  63 +---
 .../org/apache/spark/sql/DataFrameWriter.scala  |   2 +-
 .../scala/org/apache/spark/sql/SQLContext.scala |  15 +-
 .../spark/sql/execution/SparkStrategies.scala   |  10 +-
 .../sql/execution/datasources/DDLParser.scala   |  32 ++--
 .../spark/sql/execution/datasources/ddl.scala   |  22 +--
 .../spark/sql/execution/datasources/rules.scala |   8 +-
 .../org/apache/spark/sql/SQLQuerySuite.scala    |  35
 .../org/apache/spark/sql/hive/HiveContext.scala |  14 +-
 .../spark/sql/hive/HiveMetastoreCatalog.scala   |  22 ++-
 .../apache/spark/sql/hive/HiveStrategies.scala  |  12 +-
 .../spark/sql/hive/execution/commands.scala     |  54 +--
 .../apache/spark/sql/hive/ListTablesSuite.scala |   6 -
 .../spark/sql/hive/MultiDatabaseSuite.scala     | 158 ++-
 .../sql/hive/execution/SQLQuerySuite.scala      |  35
 16 files changed, 398 insertions(+), 94 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/43e01354/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
index aebcdeb..d701559 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
@@ -25,7 +25,9 @@ private[sql] case class TableIdentifier(table: String, database: Option[String]

   def toSeq: Seq[String] = database.toSeq :+ table

-  override def toString: String = toSeq.map("`" + _ + "`").mkString(".")
+  override def toString: String = quotedString
+
+  def quotedString: String = toSeq.map("`" + _ + "`").mkString(".")

   def unquotedString: String = toSeq.mkString(".")
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/43e01354/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
index 5766e6a..503c4f4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
@@ -23,6 +23,7 @@ import scala.collection.JavaConversions._
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer

+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.{TableIdentifier, CatalystConf, EmptyConf}
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Subquery}

@@ -55,12 +56,15 @@ trait Catalog {

   def refreshTable(tableIdent: TableIdentifier): Unit

+  // TODO: Refactor it in the work of SPARK-10104
   def registerTable(tableIdentifier: Seq[String], plan: LogicalPlan): Unit

+  // TODO: Refactor it in the work of SPARK-10104
  def unregisterTable(tableIdentifier:
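The TableIdentifier change above is small but load-bearing: quoted and unquoted renderings now have distinct, named accessors. A self-contained sketch of the resulting behavior (the case class is re-declared locally, and the db/table names are invented, purely for illustration):

    case class TableIdentifier(table: String, database: Option[String] = None) {
      def toSeq: Seq[String] = database.toSeq :+ table
      def quotedString: String = toSeq.map("`" + _ + "`").mkString(".")
      def unquotedString: String = toSeq.mkString(".")
      override def toString: String = quotedString
    }

    TableIdentifier("events", Some("sales")).quotedString    // `sales`.`events`
    TableIdentifier("events", Some("sales")).unquotedString  // sales.events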
spark git commit: [SPARK-10092] [SQL] Backports #8324 to branch-1.5
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 71aa54755 -> 675e22494


[SPARK-10092] [SQL] Backports #8324 to branch-1.5

Author: Yin Huai <yh...@databricks.com>

Closes #8336 from liancheng/spark-10092/for-branch-1.5.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/675e2249
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/675e2249
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/675e2249

Branch: refs/heads/branch-1.5
Commit: 675e2249472fbadecb5c8f8da6ae8ff7a1f05305
Parents: 71aa547
Author: Yin Huai <yh...@databricks.com>
Authored: Thu Aug 20 18:43:24 2015 +0800
Committer: Cheng Lian <l...@databricks.com>
Committed: Thu Aug 20 18:43:24 2015 +0800

----------------------------------------------------------------------
 .../spark/sql/catalyst/TableIdentifier.scala    |   4 +-
 .../spark/sql/catalyst/analysis/Catalog.scala   |  63 +---
 .../org/apache/spark/sql/DataFrameWriter.scala  |   2 +-
 .../scala/org/apache/spark/sql/SQLContext.scala |  15 +-
 .../spark/sql/execution/SparkStrategies.scala   |  10 +-
 .../sql/execution/datasources/DDLParser.scala   |  32 ++--
 .../spark/sql/execution/datasources/ddl.scala   |  22 +--
 .../spark/sql/execution/datasources/rules.scala |   8 +-
 .../org/apache/spark/sql/SQLQuerySuite.scala    |  35
 .../org/apache/spark/sql/hive/HiveContext.scala |  14 +-
 .../spark/sql/hive/HiveMetastoreCatalog.scala   |  22 ++-
 .../apache/spark/sql/hive/HiveStrategies.scala  |  12 +-
 .../spark/sql/hive/execution/commands.scala     |  54 +--
 .../apache/spark/sql/hive/ListTablesSuite.scala |   6 -
 .../spark/sql/hive/MultiDatabaseSuite.scala     | 158 ++-
 .../sql/hive/execution/SQLQuerySuite.scala      |  35
 16 files changed, 398 insertions(+), 94 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/675e2249/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
index aebcdeb..d701559 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala
@@ -25,7 +25,9 @@ private[sql] case class TableIdentifier(table: String, database: Option[String]

   def toSeq: Seq[String] = database.toSeq :+ table

-  override def toString: String = toSeq.map("`" + _ + "`").mkString(".")
+  override def toString: String = quotedString
+
+  def quotedString: String = toSeq.map("`" + _ + "`").mkString(".")

   def unquotedString: String = toSeq.mkString(".")
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/675e2249/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
index 5766e6a..503c4f4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala
@@ -23,6 +23,7 @@ import scala.collection.JavaConversions._
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer

+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.{TableIdentifier, CatalystConf, EmptyConf}
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Subquery}

@@ -55,12 +56,15 @@ trait Catalog {

   def refreshTable(tableIdent: TableIdentifier): Unit

+  // TODO: Refactor it in the work of SPARK-10104
   def registerTable(tableIdentifier: Seq[String], plan: LogicalPlan): Unit

+  // TODO: Refactor it in the work of SPARK-10104
   def unregisterTable(tableIdentifier: Seq[String]): Unit

   def unregisterAllTables(): Unit

+  // TODO: Refactor it in the work of SPARK-10104
   protected def processTableIdentifier(tableIdentifier: Seq[String]): Seq[String] = {
     if (conf.caseSensitiveAnalysis) {
       tableIdentifier
@@ -69,6 +73,7 @@ trait Catalog {
     }
   }

+  // TODO: Refactor it in the work of SPARK-10104
   protected def getDbTableName(tableIdent: Seq[String]): String = {
     val size = tableIdent.size
     if (size <= 2) {
@@ -78,9 +83,22 @@ trait Catalog {
     }
   }

+  // TODO: Refactor it in the work of SPARK-10104
   protected def getDBTable(tableIdent: Seq[String]) : (Option[String], String) = {
     (tableIdent.lift(tableIdent.size - 2), tableIdent.last)
   }
+
+  /**
+   * It is not allowed to specifiy database name for tables stored in [[SimpleCatalog]].
+   * We use
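In user-facing terms, the backported fix means a database-qualified saveAsTable now lands in the right location. A hedged usage sketch (the database and table names are hypothetical, and `df` stands for any existing DataFrame built on a HiveContext):

    // Assumes database "mydb" already exists in the Hive metastore.
    df.write.format("parquet").saveAsTable("mydb.mytable")
    // Temporary tables, by contrast, may not carry a database qualifier.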
Git Push Summary
Repository: spark
Updated Tags: refs/tags/v1.5.0-rc1 [created] 4c56ad772
[1/2] spark git commit: Preparing Spark release v1.5.0-rc1
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 175c1d9c9 -> 988e838a2


Preparing Spark release v1.5.0-rc1


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4c56ad77
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4c56ad77
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4c56ad77

Branch: refs/heads/branch-1.5
Commit: 4c56ad772637615cc1f4f88d619fac6c372c8552
Parents: 175c1d9
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Thu Aug 20 16:24:07 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Thu Aug 20 16:24:07 2015 -0700

----------------------------------------------------------------------
 assembly/pom.xml                    | 2 +-
 bagel/pom.xml                       | 2 +-
 core/pom.xml                        | 2 +-
 examples/pom.xml                    | 2 +-
 external/flume-assembly/pom.xml     | 2 +-
 external/flume-sink/pom.xml         | 2 +-
 external/flume/pom.xml              | 2 +-
 external/kafka-assembly/pom.xml     | 2 +-
 external/kafka/pom.xml              | 2 +-
 external/mqtt-assembly/pom.xml      | 2 +-
 external/mqtt/pom.xml               | 2 +-
 external/twitter/pom.xml            | 2 +-
 external/zeromq/pom.xml             | 2 +-
 extras/java8-tests/pom.xml          | 2 +-
 extras/kinesis-asl-assembly/pom.xml | 2 +-
 extras/kinesis-asl/pom.xml          | 2 +-
 extras/spark-ganglia-lgpl/pom.xml   | 2 +-
 graphx/pom.xml                      | 2 +-
 launcher/pom.xml                    | 2 +-
 mllib/pom.xml                       | 2 +-
 network/common/pom.xml              | 2 +-
 network/shuffle/pom.xml             | 2 +-
 network/yarn/pom.xml                | 2 +-
 pom.xml                             | 2 +-
 repl/pom.xml                        | 2 +-
 sql/catalyst/pom.xml                | 2 +-
 sql/core/pom.xml                    | 2 +-
 sql/hive-thriftserver/pom.xml       | 2 +-
 sql/hive/pom.xml                    | 2 +-
 streaming/pom.xml                   | 2 +-
 tools/pom.xml                       | 2 +-
 unsafe/pom.xml                      | 2 +-
 yarn/pom.xml                        | 2 +-
 33 files changed, 33 insertions(+), 33 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/4c56ad77/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index e9c6d26..3ef7d6f 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/4c56ad77/bagel/pom.xml
----------------------------------------------------------------------
diff --git a/bagel/pom.xml b/bagel/pom.xml
index ed5c37e..684e07b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/4c56ad77/core/pom.xml
----------------------------------------------------------------------
diff --git a/core/pom.xml b/core/pom.xml
index 4f79d71..6b082ad 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/4c56ad77/examples/pom.xml
----------------------------------------------------------------------
diff --git a/examples/pom.xml b/examples/pom.xml
index e6884b0..9ef1eda 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/4c56ad77/external/flume-assembly/pom.xml
----------------------------------------------------------------------
diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml
index e05e431..fe878e6 100644
--- a/external/flume-assembly/pom.xml
+++ b/external/flume-assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/4c56ad77/external/flume-sink/pom.xml
----------------------------------------------------------------------
diff --git
Git Push Summary
Repository: spark
Updated Tags: refs/tags/v1.5.0-rc1 [deleted] d837d51d5
spark git commit: [SPARK-10140] [DOC] add target fields to @Since
Repository: spark
Updated Branches:
  refs/heads/master afe9f03fd -> cdd9a2bb1


[SPARK-10140] [DOC] add target fields to @Since

so constructor parameters and public fields can be annotated.

rxin MechCoder

Author: Xiangrui Meng <m...@databricks.com>

Closes #8344 from mengxr/SPARK-10140.2.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cdd9a2bb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cdd9a2bb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cdd9a2bb

Branch: refs/heads/master
Commit: cdd9a2bb10e20556003843a0f7aaa33acd55f6d2
Parents: afe9f03
Author: Xiangrui Meng <m...@databricks.com>
Authored: Thu Aug 20 20:01:13 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Aug 20 20:01:13 2015 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/annotation/Since.scala | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cdd9a2bb/core/src/main/scala/org/apache/spark/annotation/Since.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/annotation/Since.scala b/core/src/main/scala/org/apache/spark/annotation/Since.scala
index fa59393..af483e3 100644
--- a/core/src/main/scala/org/apache/spark/annotation/Since.scala
+++ b/core/src/main/scala/org/apache/spark/annotation/Since.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.annotation

 import scala.annotation.StaticAnnotation
+import scala.annotation.meta._

 /**
  * A Scala annotation that specifies the Spark version when a definition was added.
@@ -25,4 +26,5 @@ import scala.annotation.StaticAnnotation
  * hence works for overridden methods that inherit API documentation directly from parents.
  * The limitation is that it does not show up in the generated Java API documentation.
  */
+@param @field @getter @setter @beanGetter @beanSetter
 private[spark] class Since(version: String) extends StaticAnnotation
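The meta-annotation targets determine where an annotation written on a constructor parameter or field ends up. A self-contained sketch of what the change enables (the model class, its members, and the version strings are hypothetical, for illustration only):

    import scala.annotation.StaticAnnotation
    import scala.annotation.meta._

    @param @field @getter @setter @beanGetter @beanSetter
    class Since(version: String) extends StaticAnnotation

    // With the meta targets, the annotation written on the constructor
    // parameter also applies to the generated field and accessors.
    class LinearModel(@Since("1.5.0") val weights: Array[Double]) {
      @Since("1.5.0") var threshold: Double = 0.5
    }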
spark git commit: [SPARK-9846] [DOCS] User guide for Multilayer Perceptron Classifier
Repository: spark
Updated Branches:
  refs/heads/master cdd9a2bb1 -> dcfe0c5cd


[SPARK-9846] [DOCS] User guide for Multilayer Perceptron Classifier

Added user guide for multilayer perceptron classifier:
- Simplified description of the multilayer perceptron classifier
- Example code for Scala and Java

Author: Alexander Ulanov <na...@yandex.ru>

Closes #8262 from avulanov/SPARK-9846-mlpc-docs.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dcfe0c5c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dcfe0c5c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dcfe0c5c

Branch: refs/heads/master
Commit: dcfe0c5cde953b31c5bfeb6e41d1fc9b333241eb
Parents: cdd9a2b
Author: Alexander Ulanov <na...@yandex.ru>
Authored: Thu Aug 20 20:02:27 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Aug 20 20:02:27 2015 -0700

----------------------------------------------------------------------
 docs/ml-ann.md   | 123 ++
 docs/ml-guide.md |   1 +
 2 files changed, 124 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/dcfe0c5c/docs/ml-ann.md
----------------------------------------------------------------------
diff --git a/docs/ml-ann.md b/docs/ml-ann.md
new file mode 100644
index 0000000..d5ddd92
--- /dev/null
+++ b/docs/ml-ann.md
@@ -0,0 +1,123 @@
+---
+layout: global
+title: Multilayer perceptron classifier - ML
+displayTitle: <a href="ml-guide.html">ML</a> - Multilayer perceptron classifier
+---
+
+`\[
+\newcommand{\R}{\mathbb{R}}
+\newcommand{\E}{\mathbb{E}}
+\newcommand{\x}{\mathbf{x}}
+\newcommand{\y}{\mathbf{y}}
+\newcommand{\wv}{\mathbf{w}}
+\newcommand{\av}{\mathbf{\alpha}}
+\newcommand{\bv}{\mathbf{b}}
+\newcommand{\N}{\mathbb{N}}
+\newcommand{\id}{\mathbf{I}}
+\newcommand{\ind}{\mathbf{1}}
+\newcommand{\0}{\mathbf{0}}
+\newcommand{\unit}{\mathbf{e}}
+\newcommand{\one}{\mathbf{1}}
+\newcommand{\zero}{\mathbf{0}}
+\]`
+
+Multilayer perceptron classifier (MLPC) is a classifier based on the [feedforward artificial neural network](https://en.wikipedia.org/wiki/Feedforward_neural_network).
+MLPC consists of multiple layers of nodes.
+Each layer is fully connected to the next layer in the network. Nodes in the input layer represent the input data. All other nodes maps inputs to the outputs
+by performing linear combination of the inputs with the node's weights `$\wv$` and bias `$\bv$` and applying an activation function.
+It can be written in matrix form for MLPC with `$K+1$` layers as follows:
+`\[
+\mathrm{y}(\x) = \mathrm{f_K}(...\mathrm{f_2}(\wv_2^T\mathrm{f_1}(\wv_1^T \x+b_1)+b_2)...+b_K)
+\]`
+Nodes in intermediate layers use sigmoid (logistic) function:
+`\[
+\mathrm{f}(z_i) = \frac{1}{1 + e^{-z_i}}
+\]`
+Nodes in the output layer use softmax function:
+`\[
+\mathrm{f}(z_i) = \frac{e^{z_i}}{\sum_{k=1}^N e^{z_k}}
+\]`
+The number of nodes `$N$` in the output layer corresponds to the number of classes.
+
+MLPC employes backpropagation for learning the model. We use logistic loss function for optimization and L-BFGS as optimization routine.
+
+**Examples**
+
+<div class="codetabs">
+
+<div data-lang="scala" markdown="1">
+
+{% highlight scala %}
+import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
+import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
+import org.apache.spark.mllib.util.MLUtils
+import org.apache.spark.sql.Row
+
+// Load training data
+val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_multiclass_classification_data.txt").toDF()
+// Split the data into train and test
+val splits = data.randomSplit(Array(0.6, 0.4), seed = 1234L)
+val train = splits(0)
+val test = splits(1)
+// specify layers for the neural network:
+// input layer of size 4 (features), two intermediate of size 5 and 4 and output of size 3 (classes)
+val layers = Array[Int](4, 5, 4, 3)
+// create the trainer and set its parameters
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(128)
+  .setSeed(1234L)
+  .setMaxIter(100)
+// train the model
+val model = trainer.fit(train)
+// compute precision on the test set
+val result = model.transform(test)
+val predictionAndLabels = result.select("prediction", "label")
+val evaluator = new MulticlassClassificationEvaluator()
+  .setMetricName("precision")
+println("Precision:" + evaluator.evaluate(predictionAndLabels))
+{% endhighlight %}
+
+</div>
+
+<div data-lang="java" markdown="1">
+
+{% highlight java %}
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel;
+import org.apache.spark.ml.classification.MultilayerPerceptronClassifier;
+import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
+import org.apache.spark.mllib.regression.LabeledPoint;
+import org.apache.spark.mllib.util.MLUtils;
+
+//
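The two activation formulas quoted in the guide translate directly into code; a minimal Scala sketch, for reference only (the patch itself only adds documentation):

    // Sigmoid, used by the intermediate layers.
    def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

    // Softmax, used by the output layer: one value per class, summing to 1.
    // A production version would subtract z.max first for numerical stability.
    def softmax(z: Array[Double]): Array[Double] = {
      val exps = z.map(math.exp)
      val total = exps.sum
      exps.map(_ / total)
    }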
spark git commit: [SPARK-9846] [DOCS] User guide for Multilayer Perceptron Classifier
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 04ef52a5b -> e5e601739


[SPARK-9846] [DOCS] User guide for Multilayer Perceptron Classifier

Added user guide for multilayer perceptron classifier:
- Simplified description of the multilayer perceptron classifier
- Example code for Scala and Java

Author: Alexander Ulanov <na...@yandex.ru>

Closes #8262 from avulanov/SPARK-9846-mlpc-docs.

(cherry picked from commit dcfe0c5cde953b31c5bfeb6e41d1fc9b333241eb)
Signed-off-by: Xiangrui Meng <m...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e5e60173
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e5e60173
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e5e60173

Branch: refs/heads/branch-1.5
Commit: e5e601739b1ec49da95f30657bfcfc691e35d9be
Parents: 04ef52a
Author: Alexander Ulanov <na...@yandex.ru>
Authored: Thu Aug 20 20:02:27 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Aug 20 20:02:34 2015 -0700

----------------------------------------------------------------------
 docs/ml-ann.md   | 123 ++
 docs/ml-guide.md |   1 +
 2 files changed, 124 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e5e60173/docs/ml-ann.md
----------------------------------------------------------------------
diff --git a/docs/ml-ann.md b/docs/ml-ann.md
new file mode 100644
index 0000000..d5ddd92
--- /dev/null
+++ b/docs/ml-ann.md
@@ -0,0 +1,123 @@
+---
+layout: global
+title: Multilayer perceptron classifier - ML
+displayTitle: <a href="ml-guide.html">ML</a> - Multilayer perceptron classifier
+---
+
+`\[
+\newcommand{\R}{\mathbb{R}}
+\newcommand{\E}{\mathbb{E}}
+\newcommand{\x}{\mathbf{x}}
+\newcommand{\y}{\mathbf{y}}
+\newcommand{\wv}{\mathbf{w}}
+\newcommand{\av}{\mathbf{\alpha}}
+\newcommand{\bv}{\mathbf{b}}
+\newcommand{\N}{\mathbb{N}}
+\newcommand{\id}{\mathbf{I}}
+\newcommand{\ind}{\mathbf{1}}
+\newcommand{\0}{\mathbf{0}}
+\newcommand{\unit}{\mathbf{e}}
+\newcommand{\one}{\mathbf{1}}
+\newcommand{\zero}{\mathbf{0}}
+\]`
+
+Multilayer perceptron classifier (MLPC) is a classifier based on the [feedforward artificial neural network](https://en.wikipedia.org/wiki/Feedforward_neural_network).
+MLPC consists of multiple layers of nodes.
+Each layer is fully connected to the next layer in the network. Nodes in the input layer represent the input data. All other nodes maps inputs to the outputs
+by performing linear combination of the inputs with the node's weights `$\wv$` and bias `$\bv$` and applying an activation function.
+It can be written in matrix form for MLPC with `$K+1$` layers as follows:
+`\[
+\mathrm{y}(\x) = \mathrm{f_K}(...\mathrm{f_2}(\wv_2^T\mathrm{f_1}(\wv_1^T \x+b_1)+b_2)...+b_K)
+\]`
+Nodes in intermediate layers use sigmoid (logistic) function:
+`\[
+\mathrm{f}(z_i) = \frac{1}{1 + e^{-z_i}}
+\]`
+Nodes in the output layer use softmax function:
+`\[
+\mathrm{f}(z_i) = \frac{e^{z_i}}{\sum_{k=1}^N e^{z_k}}
+\]`
+The number of nodes `$N$` in the output layer corresponds to the number of classes.
+
+MLPC employes backpropagation for learning the model. We use logistic loss function for optimization and L-BFGS as optimization routine.
+
+**Examples**
+
+<div class="codetabs">
+
+<div data-lang="scala" markdown="1">
+
+{% highlight scala %}
+import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
+import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
+import org.apache.spark.mllib.util.MLUtils
+import org.apache.spark.sql.Row
+
+// Load training data
+val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_multiclass_classification_data.txt").toDF()
+// Split the data into train and test
+val splits = data.randomSplit(Array(0.6, 0.4), seed = 1234L)
+val train = splits(0)
+val test = splits(1)
+// specify layers for the neural network:
+// input layer of size 4 (features), two intermediate of size 5 and 4 and output of size 3 (classes)
+val layers = Array[Int](4, 5, 4, 3)
+// create the trainer and set its parameters
+val trainer = new MultilayerPerceptronClassifier()
+  .setLayers(layers)
+  .setBlockSize(128)
+  .setSeed(1234L)
+  .setMaxIter(100)
+// train the model
+val model = trainer.fit(train)
+// compute precision on the test set
+val result = model.transform(test)
+val predictionAndLabels = result.select("prediction", "label")
+val evaluator = new MulticlassClassificationEvaluator()
+  .setMetricName("precision")
+println("Precision:" + evaluator.evaluate(predictionAndLabels))
+{% endhighlight %}
+
+</div>
+
+<div data-lang="java" markdown="1">
+
+{% highlight java %}
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel;
+import org.apache.spark.ml.classification.MultilayerPerceptronClassifier;
+import
spark git commit: [SPARK-10140] [DOC] add target fields to @Since
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 988e838a2 -> 04ef52a5b


[SPARK-10140] [DOC] add target fields to @Since

so constructor parameters and public fields can be annotated.

rxin MechCoder

Author: Xiangrui Meng <m...@databricks.com>

Closes #8344 from mengxr/SPARK-10140.2.

(cherry picked from commit cdd9a2bb10e20556003843a0f7aaa33acd55f6d2)
Signed-off-by: Xiangrui Meng <m...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/04ef52a5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/04ef52a5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/04ef52a5

Branch: refs/heads/branch-1.5
Commit: 04ef52a5bcbe8fba2941af235b8cda1255d4af8d
Parents: 988e838
Author: Xiangrui Meng <m...@databricks.com>
Authored: Thu Aug 20 20:01:13 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Aug 20 20:01:27 2015 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/annotation/Since.scala | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/04ef52a5/core/src/main/scala/org/apache/spark/annotation/Since.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/annotation/Since.scala b/core/src/main/scala/org/apache/spark/annotation/Since.scala
index fa59393..af483e3 100644
--- a/core/src/main/scala/org/apache/spark/annotation/Since.scala
+++ b/core/src/main/scala/org/apache/spark/annotation/Since.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.annotation

 import scala.annotation.StaticAnnotation
+import scala.annotation.meta._

 /**
  * A Scala annotation that specifies the Spark version when a definition was added.
@@ -25,4 +26,5 @@ import scala.annotation.StaticAnnotation
  * hence works for overridden methods that inherit API documentation directly from parents.
  * The limitation is that it does not show up in the generated Java API documentation.
  */
+@param @field @getter @setter @beanGetter @beanSetter
 private[spark] class Since(version: String) extends StaticAnnotation
[2/2] spark git commit: Preparing development version 1.5.1-SNAPSHOT
Preparing development version 1.5.1-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/988e838a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/988e838a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/988e838a

Branch: refs/heads/branch-1.5
Commit: 988e838a2fe381052f3018df4f31a55434be75ea
Parents: 4c56ad7
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Thu Aug 20 16:24:12 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Thu Aug 20 16:24:12 2015 -0700

----------------------------------------------------------------------
 assembly/pom.xml                    | 2 +-
 bagel/pom.xml                       | 2 +-
 core/pom.xml                        | 2 +-
 examples/pom.xml                    | 2 +-
 external/flume-assembly/pom.xml     | 2 +-
 external/flume-sink/pom.xml         | 2 +-
 external/flume/pom.xml              | 2 +-
 external/kafka-assembly/pom.xml     | 2 +-
 external/kafka/pom.xml              | 2 +-
 external/mqtt-assembly/pom.xml      | 2 +-
 external/mqtt/pom.xml               | 2 +-
 external/twitter/pom.xml            | 2 +-
 external/zeromq/pom.xml             | 2 +-
 extras/java8-tests/pom.xml          | 2 +-
 extras/kinesis-asl-assembly/pom.xml | 2 +-
 extras/kinesis-asl/pom.xml          | 2 +-
 extras/spark-ganglia-lgpl/pom.xml   | 2 +-
 graphx/pom.xml                      | 2 +-
 launcher/pom.xml                    | 2 +-
 mllib/pom.xml                       | 2 +-
 network/common/pom.xml              | 2 +-
 network/shuffle/pom.xml             | 2 +-
 network/yarn/pom.xml                | 2 +-
 pom.xml                             | 2 +-
 repl/pom.xml                        | 2 +-
 sql/catalyst/pom.xml                | 2 +-
 sql/core/pom.xml                    | 2 +-
 sql/hive-thriftserver/pom.xml       | 2 +-
 sql/hive/pom.xml                    | 2 +-
 streaming/pom.xml                   | 2 +-
 tools/pom.xml                       | 2 +-
 unsafe/pom.xml                      | 2 +-
 yarn/pom.xml                        | 2 +-
 33 files changed, 33 insertions(+), 33 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/988e838a/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 3ef7d6f..7b41ebb 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0</version>
+    <version>1.5.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/988e838a/bagel/pom.xml
----------------------------------------------------------------------
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 684e07b..16bf17c 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0</version>
+    <version>1.5.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/988e838a/core/pom.xml
----------------------------------------------------------------------
diff --git a/core/pom.xml b/core/pom.xml
index 6b082ad..beb547f 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0</version>
+    <version>1.5.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/988e838a/examples/pom.xml
----------------------------------------------------------------------
diff --git a/examples/pom.xml b/examples/pom.xml
index 9ef1eda..3926b79 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0</version>
+    <version>1.5.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/988e838a/external/flume-assembly/pom.xml
----------------------------------------------------------------------
diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml
index fe878e6..bdd5037 100644
--- a/external/flume-assembly/pom.xml
+++ b/external/flume-assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0</version>
+    <version>1.5.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/988e838a/external/flume-sink/pom.xml
----------------------------------------------------------------------
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index
[1/2] spark git commit: Preparing Spark release v1.5.0-rc1
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 2f47e099d -> a1785e3f5


Preparing Spark release v1.5.0-rc1


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/19b92c87
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/19b92c87
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/19b92c87

Branch: refs/heads/branch-1.5
Commit: 19b92c87a38fd3594e60e96dbf1f85e92163be36
Parents: 2f47e09
Author: Patrick Wendell <pwend...@gmail.com>
Authored: Thu Aug 20 11:06:31 2015 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Thu Aug 20 11:06:31 2015 -0700

----------------------------------------------------------------------
 assembly/pom.xml                    | 2 +-
 bagel/pom.xml                       | 2 +-
 core/pom.xml                        | 2 +-
 examples/pom.xml                    | 2 +-
 external/flume-assembly/pom.xml     | 2 +-
 external/flume-sink/pom.xml         | 2 +-
 external/flume/pom.xml              | 2 +-
 external/kafka-assembly/pom.xml     | 2 +-
 external/kafka/pom.xml              | 2 +-
 external/mqtt-assembly/pom.xml      | 2 +-
 external/mqtt/pom.xml               | 2 +-
 external/twitter/pom.xml            | 2 +-
 external/zeromq/pom.xml             | 2 +-
 extras/java8-tests/pom.xml          | 2 +-
 extras/kinesis-asl-assembly/pom.xml | 2 +-
 extras/kinesis-asl/pom.xml          | 2 +-
 extras/spark-ganglia-lgpl/pom.xml   | 2 +-
 graphx/pom.xml                      | 2 +-
 launcher/pom.xml                    | 2 +-
 mllib/pom.xml                       | 2 +-
 network/common/pom.xml              | 2 +-
 network/shuffle/pom.xml             | 2 +-
 network/yarn/pom.xml                | 2 +-
 pom.xml                             | 2 +-
 repl/pom.xml                        | 2 +-
 sql/catalyst/pom.xml                | 2 +-
 sql/core/pom.xml                    | 2 +-
 sql/hive-thriftserver/pom.xml       | 2 +-
 sql/hive/pom.xml                    | 2 +-
 streaming/pom.xml                   | 2 +-
 tools/pom.xml                       | 2 +-
 unsafe/pom.xml                      | 2 +-
 yarn/pom.xml                        | 2 +-
 33 files changed, 33 insertions(+), 33 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/19b92c87/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index e9c6d26..1567a0e 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0-rc1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/19b92c87/bagel/pom.xml
----------------------------------------------------------------------
diff --git a/bagel/pom.xml b/bagel/pom.xml
index ed5c37e..a0e68c1 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0-rc1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/19b92c87/core/pom.xml
----------------------------------------------------------------------
diff --git a/core/pom.xml b/core/pom.xml
index 4f79d71..b76aab4 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0-rc1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/19b92c87/examples/pom.xml
----------------------------------------------------------------------
diff --git a/examples/pom.xml b/examples/pom.xml
index e6884b0..e0afb4a 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0-rc1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/19b92c87/external/flume-assembly/pom.xml
----------------------------------------------------------------------
diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml
index e05e431..a4093bb 100644
--- a/external/flume-assembly/pom.xml
+++ b/external/flume-assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.5.0-SNAPSHOT</version>
+    <version>1.5.0-rc1</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

http://git-wip-us.apache.org/repos/asf/spark/blob/19b92c87/external/flume-sink/pom.xml
[2/2] spark git commit: [SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array
[SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array

I caught SPARK-10136 while adding more test cases to `ParquetAvroCompatibilitySuite`. Actual bug fix code lies in `CatalystRowConverter.scala`.

Author: Cheng Lian <l...@databricks.com>

Closes #8341 from liancheng/spark-10136/parquet-avro-nested-primitive-array.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/85f9a613
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/85f9a613
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/85f9a613

Branch: refs/heads/master
Commit: 85f9a61357994da5023b08b0a8a2eb09388ce7f8
Parents: 39e91fe
Author: Cheng Lian <l...@databricks.com>
Authored: Thu Aug 20 11:00:24 2015 -0700
Committer: Michael Armbrust <mich...@databricks.com>
Committed: Thu Aug 20 11:00:29 2015 -0700

----------------------------------------------------------------------
 .../parquet/CatalystReadSupport.scala           |   1 -
 .../parquet/CatalystRowConverter.scala          |  24 +-
 sql/core/src/test/avro/parquet-compat.avdl      |  19 +-
 sql/core/src/test/avro/parquet-compat.avpr      |  54 +-
 .../parquet/test/avro/AvroArrayOfArray.java     | 142
 .../parquet/test/avro/AvroMapOfArray.java       | 142
 .../test/avro/AvroNonNullableArrays.java        | 196 +
 .../test/avro/AvroOptionalPrimitives.java       | 466 +++
 .../parquet/test/avro/AvroPrimitives.java       | 461 +++
 .../parquet/test/avro/CompatibilityTest.java    |   2 +-
 .../parquet/test/avro/ParquetAvroCompat.java    | 821 +--
 .../parquet/ParquetAvroCompatibilitySuite.scala | 227 +++--
 .../parquet/ParquetCompatibilityTest.scala      |   7 +
 13 files changed, 1718 insertions(+), 844 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/85f9a613/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala
index a4679bb..3f8353a 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala
@@ -61,7 +61,6 @@ private[parquet] class CatalystReadSupport extends ReadSupport[InternalRow] with
        |
        |Parquet form:
        |$parquetRequestedSchema
-       |
        |Catalyst form:
        |$catalystRequestedSchema
      """.stripMargin

http://git-wip-us.apache.org/repos/asf/spark/blob/85f9a613/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala
index 18c5b50..d2c2db5 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala
@@ -25,11 +25,12 @@ import scala.collection.mutable.ArrayBuffer

 import org.apache.parquet.column.Dictionary
 import org.apache.parquet.io.api.{Binary, Converter, GroupConverter, PrimitiveConverter}
-import org.apache.parquet.schema.OriginalType.{LIST, INT_32, UTF8}
+import org.apache.parquet.schema.OriginalType.{INT_32, LIST, UTF8}
 import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.DOUBLE
 import org.apache.parquet.schema.Type.Repetition
 import org.apache.parquet.schema.{GroupType, MessageType, PrimitiveType, Type}

+import org.apache.spark.Logging
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
@@ -145,7 +146,16 @@ private[parquet] class CatalystRowConverter(
     parquetType: GroupType,
     catalystType: StructType,
     updater: ParentContainerUpdater)
-  extends CatalystGroupConverter(updater) {
+  extends CatalystGroupConverter(updater) with Logging {
+
+  logDebug(
+    s"""Building row converter for the following schema:
+       |
+       |Parquet form:
+       |$parquetType
+       |Catalyst form:
+       |${catalystType.prettyJson}
+     """.stripMargin)

   /**
    * Updater used together with field converters within a [[CatalystRowConverter]]. It propagates
@@ -464,9 +474,15 @@ private[parquet] class CatalystRowConverter(
     override def getConverter(fieldIndex: Int): Converter = converter
-
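For reference, the shape that was mishandled, an Avro array whose elements are themselves arrays of primitives, corresponds to the following Catalyst type. This is a sketch of the expected mapping, not code from the patch:

    import org.apache.spark.sql.types.{ArrayType, IntegerType}

    // Avro array<array<int>>: primitive items cannot be null unless unioned
    // with "null", hence containsNull = false at both nesting levels
    // (compare the AvroNonNullableArrays test class added above).
    val nestedInts = ArrayType(ArrayType(IntegerType, containsNull = false), containsNull = false)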
Git Push Summary
Repository: spark
Updated Tags: refs/tags/v1.5.0-rc1 [created] 19b92c87a
spark git commit: [SPARK-10126] [PROJECT INFRA] Fix typo in release-build.sh which broke snapshot publishing for Scala 2.11
Repository: spark
Updated Branches:
  refs/heads/master 85f9a6135 -> 12de34833


[SPARK-10126] [PROJECT INFRA] Fix typo in release-build.sh which broke snapshot publishing for Scala 2.11

The current `release-build.sh` has a typo which breaks snapshot publication for Scala 2.11. We should change the Scala version to 2.11 and clean before building a 2.11 snapshot.

Author: Josh Rosen <joshro...@databricks.com>

Closes #8325 from JoshRosen/fix-2.11-snapshots.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/12de3483
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/12de3483
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/12de3483

Branch: refs/heads/master
Commit: 12de348332108f8c0c5bdad1d4cfac89b952b0f8
Parents: 85f9a61
Author: Josh Rosen <joshro...@databricks.com>
Authored: Thu Aug 20 11:31:03 2015 -0700
Committer: Josh Rosen <joshro...@databricks.com>
Committed: Thu Aug 20 11:31:03 2015 -0700

----------------------------------------------------------------------
 dev/create-release/release-build.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/12de3483/dev/create-release/release-build.sh
----------------------------------------------------------------------
diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh
index 399c73e..d0b3a54 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -225,9 +225,9 @@ if [[ $1 == "publish-snapshot" ]]; then
   $MVN -DzincPort=$ZINC_PORT --settings $tmp_settings -DskipTests $PUBLISH_PROFILES \
     -Phive-thriftserver deploy

-  ./dev/change-scala-version.sh 2.10
+  ./dev/change-scala-version.sh 2.11
   $MVN -DzincPort=$ZINC_PORT -Dscala-2.11 --settings $tmp_settings \
-    -DskipTests $PUBLISH_PROFILES deploy
+    -DskipTests $PUBLISH_PROFILES clean deploy

   # Clean-up Zinc nailgun process
   /usr/sbin/lsof -P |grep $ZINC_PORT | grep LISTEN | awk '{ print $2; }' | xargs kill
spark git commit: [SPARK-10126] [PROJECT INFRA] Fix typo in release-build.sh which broke snapshot publishing for Scala 2.11
Repository: spark
Updated Branches:
  refs/heads/branch-1.5 a1785e3f5 -> 6026f4fd7


[SPARK-10126] [PROJECT INFRA] Fix typo in release-build.sh which broke snapshot publishing for Scala 2.11

The current `release-build.sh` has a typo which breaks snapshot publication for Scala 2.11. We should change the Scala version to 2.11 and clean before building a 2.11 snapshot.

Author: Josh Rosen <joshro...@databricks.com>

Closes #8325 from JoshRosen/fix-2.11-snapshots.

(cherry picked from commit 12de348332108f8c0c5bdad1d4cfac89b952b0f8)
Signed-off-by: Josh Rosen <joshro...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6026f4fd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6026f4fd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6026f4fd

Branch: refs/heads/branch-1.5
Commit: 6026f4fd729f4c7158a87c5c706fde866d7aae60
Parents: a1785e3
Author: Josh Rosen <joshro...@databricks.com>
Authored: Thu Aug 20 11:31:03 2015 -0700
Committer: Josh Rosen <joshro...@databricks.com>
Committed: Thu Aug 20 11:31:21 2015 -0700

----------------------------------------------------------------------
 dev/create-release/release-build.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/6026f4fd/dev/create-release/release-build.sh
----------------------------------------------------------------------
diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh
index 399c73e..d0b3a54 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -225,9 +225,9 @@ if [[ $1 == "publish-snapshot" ]]; then
   $MVN -DzincPort=$ZINC_PORT --settings $tmp_settings -DskipTests $PUBLISH_PROFILES \
     -Phive-thriftserver deploy

-  ./dev/change-scala-version.sh 2.10
+  ./dev/change-scala-version.sh 2.11
   $MVN -DzincPort=$ZINC_PORT -Dscala-2.11 --settings $tmp_settings \
-    -DskipTests $PUBLISH_PROFILES deploy
+    -DskipTests $PUBLISH_PROFILES clean deploy

   # Clean-up Zinc nailgun process
   /usr/sbin/lsof -P |grep $ZINC_PORT | grep LISTEN | awk '{ print $2; }' | xargs kill
[2/2] spark git commit: [SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array
[SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array I caught SPARK-10136 while adding more test cases to `ParquetAvroCompatibilitySuite`. Actual bug fix code lies in `CatalystRowConverter.scala`. Author: Cheng Lian l...@databricks.com Closes #8341 from liancheng/spark-10136/parquet-avro-nested-primitive-array. (cherry picked from commit 85f9a61357994da5023b08b0a8a2eb09388ce7f8) Signed-off-by: Michael Armbrust mich...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2f47e099 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2f47e099 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2f47e099 Branch: refs/heads/branch-1.5 Commit: 2f47e099d31275f03ad372483e1bb23a322044f5 Parents: a7027e6 Author: Cheng Lian l...@databricks.com Authored: Thu Aug 20 11:00:24 2015 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Thu Aug 20 11:02:02 2015 -0700 -- .../parquet/CatalystReadSupport.scala | 1 - .../parquet/CatalystRowConverter.scala | 24 +- sql/core/src/test/avro/parquet-compat.avdl | 19 +- sql/core/src/test/avro/parquet-compat.avpr | 54 +- .../parquet/test/avro/AvroArrayOfArray.java | 142 .../parquet/test/avro/AvroMapOfArray.java | 142 .../test/avro/AvroNonNullableArrays.java| 196 + .../test/avro/AvroOptionalPrimitives.java | 466 +++ .../parquet/test/avro/AvroPrimitives.java | 461 +++ .../parquet/test/avro/CompatibilityTest.java| 2 +- .../parquet/test/avro/ParquetAvroCompat.java| 821 +-- .../parquet/ParquetAvroCompatibilitySuite.scala | 227 +++-- .../parquet/ParquetCompatibilityTest.scala | 7 + 13 files changed, 1718 insertions(+), 844 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/2f47e099/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala index a4679bb..3f8353a 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystReadSupport.scala @@ -61,7 +61,6 @@ private[parquet] class CatalystReadSupport extends ReadSupport[InternalRow] with | |Parquet form: |$parquetRequestedSchema - | |Catalyst form: |$catalystRequestedSchema .stripMargin http://git-wip-us.apache.org/repos/asf/spark/blob/2f47e099/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala index 18c5b50..d2c2db5 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala @@ -25,11 +25,12 @@ import scala.collection.mutable.ArrayBuffer import org.apache.parquet.column.Dictionary import org.apache.parquet.io.api.{Binary, Converter, GroupConverter, PrimitiveConverter} -import org.apache.parquet.schema.OriginalType.{LIST, INT_32, UTF8} +import org.apache.parquet.schema.OriginalType.{INT_32, LIST, UTF8} import 
org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.DOUBLE import org.apache.parquet.schema.Type.Repetition import org.apache.parquet.schema.{GroupType, MessageType, PrimitiveType, Type} +import org.apache.spark.Logging import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.util.DateTimeUtils @@ -145,7 +146,16 @@ private[parquet] class CatalystRowConverter( parquetType: GroupType, catalystType: StructType, updater: ParentContainerUpdater) - extends CatalystGroupConverter(updater) { + extends CatalystGroupConverter(updater) with Logging { + + logDebug( +sBuilding row converter for the following schema: + | + |Parquet form: + |$parquetType + |Catalyst form: + |${catalystType.prettyJson} + .stripMargin) /** * Updater used together with field converters within a [[CatalystRowConverter]]. It propagates @@ -464,9 +474,15 @@
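For readers reproducing the bug, here is a minimal sketch of the read path the new `ParquetAvroCompatibilitySuite` cases exercise, assuming a local Parquet file written by parquet-avro whose schema holds a non-nullable array of primitive arrays; the path and column name below are hypothetical, not from the patch:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object NestedArrayReadSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical input: written by parquet-avro, with a column
    // "int_arrays_column" of type array<array<int>> (non-nullable elements),
    // the exact shape that previously tripped up CatalystRowConverter.
    val sc = new SparkContext(
      new SparkConf().setAppName("nested-arrays").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.parquet("/tmp/parquet-avro-nested-arrays.parquet")
    df.select("int_arrays_column").collect().foreach(println)
    sc.stop()
  }
}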
[1/2] spark git commit: [SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array
Repository: spark Updated Branches: refs/heads/master 39e91fe2f - 85f9a6135 http://git-wip-us.apache.org/repos/asf/spark/blob/85f9a613/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java -- diff --git a/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java b/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java index 681cacb..ef12d19 100644 --- a/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java +++ b/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java @@ -7,22 +7,8 @@ package org.apache.spark.sql.execution.datasources.parquet.test.avro; @SuppressWarnings(all) @org.apache.avro.specific.AvroGenerated public class ParquetAvroCompat extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord { - public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse({\type\:\record\,\name\:\ParquetAvroCompat\,\namespace\:\org.apache.spark.sql.execution.datasources.parquet.test.avro\,\fields\:[{\name\:\bool_column\,\type\:\boolean\},{\name\:\int_column\,\type\:\int\},{\name\:\long_column\,\type\:\long\},{\name\:\float_column\,\type\:\float\},{\name\:\double_column\,\type\:\double\},{\name\:\binary_column\,\type\:\bytes\},{\name\:\string_column\,\type\:{\type\:\string\,\avro.java.string\:\String\}},{\name\:\maybe_bool_column\,\type\:[\null\,\boolean\]},{\name\:\maybe_int_column\,\type\:[\null\,\int\]},{\name\:\maybe_long_column\,\type\:[\null\,\long\]},{\name\:\maybe_float_column\,\type\:[\null\,\float\]},{\name\:\maybe_double_column\,\type\:[\null\,\double\]},{\name\:\maybe_binary_column\,\type\:[\null\,\bytes\]},{\ name\:\maybe_string_column\,\type\:[\null\,{\type\:\string\,\avro.java.string\:\String\}]},{\name\:\strings_column\,\type\:{\type\:\array\,\items\:{\type\:\string\,\avro.java.string\:\String\}}},{\name\:\string_to_int_column\,\type\:{\type\:\map\,\values\:\int\,\avro.java.string\:\String\}},{\name\:\complex_column\,\type\:{\type\:\map\,\values\:{\type\:\array\,\items\:{\type\:\record\,\name\:\Nested\,\fields\:[{\name\:\nested_ints_column\,\type\:{\type\:\array\,\items\:\int\}},{\name\:\nested_string_column\,\type\:{\type\:\string\,\avro.java.string\:\String\}}]}},\avro.java.string\:\String\}}]}); + public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse({\type\:\record\,\name\:\ParquetAvroCompat\,\namespace\:\org.apache.spark.sql.execution.datasources.parquet.test.avro\,\fields\:[{\name\:\strings_column\,\type\:{\type\:\array\,\items\:{\type\:\string\,\avro.java.string\:\String\}}},{\name\:\string_to_int_column\,\type\:{\type\:\map\,\values\:\int\,\avro.java.string\:\String\}},{\name\:\complex_column\,\type\:{\type\:\map\,\values\:{\type\:\array\,\items\:{\type\:\record\,\name\:\Nested\,\fields\:[{\name\:\nested_ints_column\,\type\:{\type\:\array\,\items\:\int\}},{\name\:\nested_string_column\,\type\:{\type\:\string\,\avro.java.string\:\String\}}]}},\avro.java.string\:\String\}}]}); public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; } - @Deprecated public boolean bool_column; - @Deprecated public int int_column; - @Deprecated public long long_column; - @Deprecated public float float_column; - @Deprecated public double double_column; - @Deprecated public 
java.nio.ByteBuffer binary_column; - @Deprecated public java.lang.String string_column; - @Deprecated public java.lang.Boolean maybe_bool_column; - @Deprecated public java.lang.Integer maybe_int_column; - @Deprecated public java.lang.Long maybe_long_column; - @Deprecated public java.lang.Float maybe_float_column; - @Deprecated public java.lang.Double maybe_double_column; - @Deprecated public java.nio.ByteBuffer maybe_binary_column; - @Deprecated public java.lang.String maybe_string_column; @Deprecated public java.util.Listjava.lang.String strings_column; @Deprecated public java.util.Mapjava.lang.String,java.lang.Integer string_to_int_column; @Deprecated public java.util.Mapjava.lang.String,java.util.Listorg.apache.spark.sql.execution.datasources.parquet.test.avro.Nested complex_column; @@ -37,21 +23,7 @@ public class ParquetAvroCompat extends org.apache.avro.specific.SpecificRecordBa /** * All-args constructor. */ - public ParquetAvroCompat(java.lang.Boolean bool_column, java.lang.Integer int_column, java.lang.Long long_column, java.lang.Float float_column, java.lang.Double double_column, java.nio.ByteBuffer binary_column, java.lang.String string_column, java.lang.Boolean maybe_bool_column, java.lang.Integer maybe_int_column, java.lang.Long maybe_long_column, java.lang.Float maybe_float_column,
[1/2] spark git commit: [SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array
Repository: spark Updated Branches: refs/heads/branch-1.5 a7027e6d3 - 2f47e099d http://git-wip-us.apache.org/repos/asf/spark/blob/2f47e099/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java -- diff --git a/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java b/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java index 681cacb..ef12d19 100644 --- a/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java +++ b/sql/core/src/test/gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro/ParquetAvroCompat.java @@ -7,22 +7,8 @@ package org.apache.spark.sql.execution.datasources.parquet.test.avro; @SuppressWarnings(all) @org.apache.avro.specific.AvroGenerated public class ParquetAvroCompat extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord { - public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse({\type\:\record\,\name\:\ParquetAvroCompat\,\namespace\:\org.apache.spark.sql.execution.datasources.parquet.test.avro\,\fields\:[{\name\:\bool_column\,\type\:\boolean\},{\name\:\int_column\,\type\:\int\},{\name\:\long_column\,\type\:\long\},{\name\:\float_column\,\type\:\float\},{\name\:\double_column\,\type\:\double\},{\name\:\binary_column\,\type\:\bytes\},{\name\:\string_column\,\type\:{\type\:\string\,\avro.java.string\:\String\}},{\name\:\maybe_bool_column\,\type\:[\null\,\boolean\]},{\name\:\maybe_int_column\,\type\:[\null\,\int\]},{\name\:\maybe_long_column\,\type\:[\null\,\long\]},{\name\:\maybe_float_column\,\type\:[\null\,\float\]},{\name\:\maybe_double_column\,\type\:[\null\,\double\]},{\name\:\maybe_binary_column\,\type\:[\null\,\bytes\]},{\ name\:\maybe_string_column\,\type\:[\null\,{\type\:\string\,\avro.java.string\:\String\}]},{\name\:\strings_column\,\type\:{\type\:\array\,\items\:{\type\:\string\,\avro.java.string\:\String\}}},{\name\:\string_to_int_column\,\type\:{\type\:\map\,\values\:\int\,\avro.java.string\:\String\}},{\name\:\complex_column\,\type\:{\type\:\map\,\values\:{\type\:\array\,\items\:{\type\:\record\,\name\:\Nested\,\fields\:[{\name\:\nested_ints_column\,\type\:{\type\:\array\,\items\:\int\}},{\name\:\nested_string_column\,\type\:{\type\:\string\,\avro.java.string\:\String\}}]}},\avro.java.string\:\String\}}]}); + public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse({\type\:\record\,\name\:\ParquetAvroCompat\,\namespace\:\org.apache.spark.sql.execution.datasources.parquet.test.avro\,\fields\:[{\name\:\strings_column\,\type\:{\type\:\array\,\items\:{\type\:\string\,\avro.java.string\:\String\}}},{\name\:\string_to_int_column\,\type\:{\type\:\map\,\values\:\int\,\avro.java.string\:\String\}},{\name\:\complex_column\,\type\:{\type\:\map\,\values\:{\type\:\array\,\items\:{\type\:\record\,\name\:\Nested\,\fields\:[{\name\:\nested_ints_column\,\type\:{\type\:\array\,\items\:\int\}},{\name\:\nested_string_column\,\type\:{\type\:\string\,\avro.java.string\:\String\}}]}},\avro.java.string\:\String\}}]}); public static org.apache.avro.Schema getClassSchema() { return SCHEMA$; } - @Deprecated public boolean bool_column; - @Deprecated public int int_column; - @Deprecated public long long_column; - @Deprecated public float float_column; - @Deprecated public double double_column; - @Deprecated public 
java.nio.ByteBuffer binary_column; - @Deprecated public java.lang.String string_column; - @Deprecated public java.lang.Boolean maybe_bool_column; - @Deprecated public java.lang.Integer maybe_int_column; - @Deprecated public java.lang.Long maybe_long_column; - @Deprecated public java.lang.Float maybe_float_column; - @Deprecated public java.lang.Double maybe_double_column; - @Deprecated public java.nio.ByteBuffer maybe_binary_column; - @Deprecated public java.lang.String maybe_string_column; @Deprecated public java.util.Listjava.lang.String strings_column; @Deprecated public java.util.Mapjava.lang.String,java.lang.Integer string_to_int_column; @Deprecated public java.util.Mapjava.lang.String,java.util.Listorg.apache.spark.sql.execution.datasources.parquet.test.avro.Nested complex_column; @@ -37,21 +23,7 @@ public class ParquetAvroCompat extends org.apache.avro.specific.SpecificRecordBa /** * All-args constructor. */ - public ParquetAvroCompat(java.lang.Boolean bool_column, java.lang.Integer int_column, java.lang.Long long_column, java.lang.Float float_column, java.lang.Double double_column, java.nio.ByteBuffer binary_column, java.lang.String string_column, java.lang.Boolean maybe_bool_column, java.lang.Integer maybe_int_column, java.lang.Long maybe_long_column, java.lang.Float maybe_float_column,
spark git commit: [SQL] [MINOR] remove unnecessary class
Repository: spark Updated Branches: refs/heads/master 12de34833 - 907df2fce [SQL] [MINOR] remove unnecessary class This class is identical to `org.apache.spark.sql.execution.datasources.jdbc. DefaultSource` and is not needed. Author: Wenchen Fan cloud0...@outlook.com Closes #8334 from cloud-fan/minor. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/907df2fc Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/907df2fc Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/907df2fc Branch: refs/heads/master Commit: 907df2fce00d2cbc9fae371344f05f800e0d2726 Parents: 12de348 Author: Wenchen Fan cloud0...@outlook.com Authored: Thu Aug 20 13:51:54 2015 -0700 Committer: Reynold Xin r...@databricks.com Committed: Thu Aug 20 13:51:54 2015 -0700 -- .../execution/datasources/DefaultSource.scala | 64 1 file changed, 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/907df2fc/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DefaultSource.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DefaultSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DefaultSource.scala deleted file mode 100644 index 6e4cc4d..000 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DefaultSource.scala +++ /dev/null @@ -1,64 +0,0 @@ -/* -* Licensed to the Apache Software Foundation (ASF) under one or more -* contributor license agreements. See the NOTICE file distributed with -* this work for additional information regarding copyright ownership. -* The ASF licenses this file to You under the Apache License, Version 2.0 -* (the License); you may not use this file except in compliance with -* the License. You may obtain a copy of the License at -* -*http://www.apache.org/licenses/LICENSE-2.0 -* -* Unless required by applicable law or agreed to in writing, software -* distributed under the License is distributed on an AS IS BASIS, -* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -* See the License for the specific language governing permissions and -* limitations under the License. -*/ - -package org.apache.spark.sql.execution.datasources - -import java.util.Properties - -import org.apache.spark.sql.SQLContext -import org.apache.spark.sql.execution.datasources.jdbc.{JDBCRelation, JDBCPartitioningInfo, DriverRegistry} -import org.apache.spark.sql.sources.{BaseRelation, DataSourceRegister, RelationProvider} - - -class DefaultSource extends RelationProvider with DataSourceRegister { - - override def shortName(): String = jdbc - - /** Returns a new base relation with the given parameters. 
*/ - override def createRelation( - sqlContext: SQLContext, - parameters: Map[String, String]): BaseRelation = { -val url = parameters.getOrElse(url, sys.error(Option 'url' not specified)) -val driver = parameters.getOrElse(driver, null) -val table = parameters.getOrElse(dbtable, sys.error(Option 'dbtable' not specified)) -val partitionColumn = parameters.getOrElse(partitionColumn, null) -val lowerBound = parameters.getOrElse(lowerBound, null) -val upperBound = parameters.getOrElse(upperBound, null) -val numPartitions = parameters.getOrElse(numPartitions, null) - -if (driver != null) DriverRegistry.register(driver) - -if (partitionColumn != null - (lowerBound == null || upperBound == null || numPartitions == null)) { - sys.error(Partitioning incompletely specified) -} - -val partitionInfo = if (partitionColumn == null) { - null -} else { - JDBCPartitioningInfo( -partitionColumn, -lowerBound.toLong, -upperBound.toLong, -numPartitions.toInt) -} -val parts = JDBCRelation.columnPartition(partitionInfo) -val properties = new Properties() // Additional properties that we will pass to getConnection -parameters.foreach(kv = properties.setProperty(kv._1, kv._2)) -JDBCRelation(url, table, parts, properties)(sqlContext) - } -} - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
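The short name registered by the surviving org.apache.spark.sql.execution.datasources.jdbc.DefaultSource is unchanged, so callers keep resolving the source as "jdbc". A minimal usage sketch, assuming an existing SQLContext, a reachable database, and the JDBC driver on the classpath; the URL and table name are hypothetical:

import org.apache.spark.sql.{DataFrame, SQLContext}

def loadPeople(sqlContext: SQLContext): DataFrame = {
  sqlContext.read
    .format("jdbc")                                      // resolved via DataSourceRegister
    .option("url", "jdbc:postgresql://localhost/testdb") // hypothetical URL
    .option("dbtable", "people")                         // hypothetical table
    .load()
}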
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.5.0-rc1 [deleted] 19b92c87a
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150810 [deleted] 3369ad9bc
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.5.0-snapshot-20150811 [deleted] 158b2ea7a
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.5.0-preview-20150812 [deleted] cedce9bdb
spark git commit: [SPARK-10100] [SQL] Eliminate hash table lookup if there is no grouping key in aggregation.
Repository: spark Updated Branches: refs/heads/branch-1.5 675e22494 - 5be517584 [SPARK-10100] [SQL] Eliminate hash table lookup if there is no grouping key in aggregation. This improves performance by ~ 20 - 30% in one of my local test and should fix the performance regression from 1.4 to 1.5 on ss_max. Author: Reynold Xin r...@databricks.com Closes #8332 from rxin/SPARK-10100. (cherry picked from commit b4f4e91c395cb69ced61d9ff1492d1b814f96828) Signed-off-by: Yin Huai yh...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5be51758 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5be51758 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5be51758 Branch: refs/heads/branch-1.5 Commit: 5be517584be0c78dc4641a4aa14ea9da05ed344d Parents: 675e224 Author: Reynold Xin r...@databricks.com Authored: Thu Aug 20 07:53:27 2015 -0700 Committer: Yin Huai yh...@databricks.com Committed: Thu Aug 20 07:53:40 2015 -0700 -- .../execution/aggregate/TungstenAggregate.scala | 2 +- .../aggregate/TungstenAggregationIterator.scala | 30 ++-- 2 files changed, 22 insertions(+), 10 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/5be51758/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala index 99f51ba..ba379d3 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala @@ -104,7 +104,7 @@ case class TungstenAggregate( } else { // This is a grouped aggregate and the input iterator is empty, // so return an empty iterator. - Iterator[UnsafeRow]() + Iterator.empty } } else { aggregationIterator.start(parentIterator) http://git-wip-us.apache.org/repos/asf/spark/blob/5be51758/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala index af7e0fc..26fdbc8 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala @@ -357,18 +357,30 @@ class TungstenAggregationIterator( // sort-based aggregation (by calling switchToSortBasedAggregation). private def processInputs(): Unit = { assert(inputIter != null, attempted to process input when iterator was null) -while (!sortBased inputIter.hasNext) { - val newInput = inputIter.next() - numInputRows += 1 - val groupingKey = groupProjection.apply(newInput) +if (groupingExpressions.isEmpty) { + // If there is no grouping expressions, we can just reuse the same buffer over and over again. + // Note that it would be better to eliminate the hash map entirely in the future. + val groupingKey = groupProjection.apply(null) val buffer: UnsafeRow = hashMap.getAggregationBufferFromUnsafeRow(groupingKey) - if (buffer == null) { -// buffer == null means that we could not allocate more memory. -// Now, we need to spill the map and switch to sort-based aggregation. 
-switchToSortBasedAggregation(groupingKey, newInput) - } else { + while (inputIter.hasNext) { +val newInput = inputIter.next() +numInputRows += 1 processRow(buffer, newInput) } +} else { + while (!sortBased inputIter.hasNext) { +val newInput = inputIter.next() +numInputRows += 1 +val groupingKey = groupProjection.apply(newInput) +val buffer: UnsafeRow = hashMap.getAggregationBufferFromUnsafeRow(groupingKey) +if (buffer == null) { + // buffer == null means that we could not allocate more memory. + // Now, we need to spill the map and switch to sort-based aggregation. + switchToSortBasedAggregation(groupingKey, newInput) +} else { + processRow(buffer, newInput) +} + } } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail:
spark git commit: [SPARK-10100] [SQL] Eliminate hash table lookup if there is no grouping key in aggregation.
Repository: spark Updated Branches: refs/heads/master 43e013542 - b4f4e91c3 [SPARK-10100] [SQL] Eliminate hash table lookup if there is no grouping key in aggregation. This improves performance by ~ 20 - 30% in one of my local test and should fix the performance regression from 1.4 to 1.5 on ss_max. Author: Reynold Xin r...@databricks.com Closes #8332 from rxin/SPARK-10100. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b4f4e91c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b4f4e91c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b4f4e91c Branch: refs/heads/master Commit: b4f4e91c395cb69ced61d9ff1492d1b814f96828 Parents: 43e0135 Author: Reynold Xin r...@databricks.com Authored: Thu Aug 20 07:53:27 2015 -0700 Committer: Yin Huai yh...@databricks.com Committed: Thu Aug 20 07:53:27 2015 -0700 -- .../execution/aggregate/TungstenAggregate.scala | 2 +- .../aggregate/TungstenAggregationIterator.scala | 30 ++-- 2 files changed, 22 insertions(+), 10 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/b4f4e91c/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala index 99f51ba..ba379d3 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregate.scala @@ -104,7 +104,7 @@ case class TungstenAggregate( } else { // This is a grouped aggregate and the input iterator is empty, // so return an empty iterator. - Iterator[UnsafeRow]() + Iterator.empty } } else { aggregationIterator.start(parentIterator) http://git-wip-us.apache.org/repos/asf/spark/blob/b4f4e91c/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala index af7e0fc..26fdbc8 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala @@ -357,18 +357,30 @@ class TungstenAggregationIterator( // sort-based aggregation (by calling switchToSortBasedAggregation). private def processInputs(): Unit = { assert(inputIter != null, attempted to process input when iterator was null) -while (!sortBased inputIter.hasNext) { - val newInput = inputIter.next() - numInputRows += 1 - val groupingKey = groupProjection.apply(newInput) +if (groupingExpressions.isEmpty) { + // If there is no grouping expressions, we can just reuse the same buffer over and over again. + // Note that it would be better to eliminate the hash map entirely in the future. + val groupingKey = groupProjection.apply(null) val buffer: UnsafeRow = hashMap.getAggregationBufferFromUnsafeRow(groupingKey) - if (buffer == null) { -// buffer == null means that we could not allocate more memory. -// Now, we need to spill the map and switch to sort-based aggregation. 
-switchToSortBasedAggregation(groupingKey, newInput) - } else { + while (inputIter.hasNext) { +val newInput = inputIter.next() +numInputRows += 1 processRow(buffer, newInput) } +} else { + while (!sortBased inputIter.hasNext) { +val newInput = inputIter.next() +numInputRows += 1 +val groupingKey = groupProjection.apply(newInput) +val buffer: UnsafeRow = hashMap.getAggregationBufferFromUnsafeRow(groupingKey) +if (buffer == null) { + // buffer == null means that we could not allocate more memory. + // Now, we need to spill the map and switch to sort-based aggregation. + switchToSortBasedAggregation(groupingKey, newInput) +} else { + processRow(buffer, newInput) +} + } } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
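With no grouping expressions, groupProjection.apply(null) produces a single constant key, so the aggregation buffer is fetched once and reused for every input row instead of doing a hash-map lookup per row. A minimal sketch of the query shape that benefits, assuming a registered store_sales table; names follow the ss_max benchmark mentioned above but are illustrative:

// Global aggregate with no GROUP BY: the fast path added by this patch.
val rows = sqlContext.sql("SELECT MAX(ss_sold_date_sk) FROM store_sales").collect()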
[1/2] spark git commit: Preparing Spark release v1.5.0-rc1
Repository: spark Updated Branches: refs/heads/branch-1.5 6026f4fd7 - eac31abdf Preparing Spark release v1.5.0-rc1 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/99eeac8c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/99eeac8c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/99eeac8c Branch: refs/heads/branch-1.5 Commit: 99eeac8cca176cfb64d5fd354a0a7c279613bbc9 Parents: 6026f4f Author: Patrick Wendell pwend...@gmail.com Authored: Thu Aug 20 12:43:08 2015 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Thu Aug 20 12:43:08 2015 -0700 -- assembly/pom.xml| 2 +- bagel/pom.xml | 2 +- core/pom.xml| 2 +- examples/pom.xml| 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml | 2 +- external/kafka-assembly/pom.xml | 2 +- external/kafka/pom.xml | 2 +- external/mqtt-assembly/pom.xml | 2 +- external/mqtt/pom.xml | 2 +- external/twitter/pom.xml| 2 +- external/zeromq/pom.xml | 2 +- extras/java8-tests/pom.xml | 2 +- extras/kinesis-asl-assembly/pom.xml | 2 +- extras/kinesis-asl/pom.xml | 2 +- extras/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml | 2 +- launcher/pom.xml| 2 +- mllib/pom.xml | 2 +- network/common/pom.xml | 2 +- network/shuffle/pom.xml | 2 +- network/yarn/pom.xml| 2 +- pom.xml | 2 +- repl/pom.xml| 2 +- sql/catalyst/pom.xml| 2 +- sql/core/pom.xml| 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml| 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- unsafe/pom.xml | 2 +- yarn/pom.xml| 2 +- 33 files changed, 33 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/99eeac8c/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index e9c6d26..1567a0e 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-SNAPSHOT/version +version1.5.0-rc1/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/99eeac8c/bagel/pom.xml -- diff --git a/bagel/pom.xml b/bagel/pom.xml index ed5c37e..a0e68c1 100644 --- a/bagel/pom.xml +++ b/bagel/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-SNAPSHOT/version +version1.5.0-rc1/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/99eeac8c/core/pom.xml -- diff --git a/core/pom.xml b/core/pom.xml index 4f79d71..b76aab4 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-SNAPSHOT/version +version1.5.0-rc1/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/99eeac8c/examples/pom.xml -- diff --git a/examples/pom.xml b/examples/pom.xml index e6884b0..e0afb4a 100644 --- a/examples/pom.xml +++ b/examples/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-SNAPSHOT/version +version1.5.0-rc1/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/99eeac8c/external/flume-assembly/pom.xml -- diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml index e05e431..a4093bb 100644 --- a/external/flume-assembly/pom.xml +++ b/external/flume-assembly/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId 
artifactIdspark-parent_2.10/artifactId -version1.5.0-SNAPSHOT/version +version1.5.0-rc1/version relativePath../../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/99eeac8c/external/flume-sink/pom.xml
[2/2] spark git commit: Preparing development version 1.5.0-SNAPSHOT
Preparing development version 1.5.0-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eac31abd Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eac31abd Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eac31abd Branch: refs/heads/branch-1.5 Commit: eac31abdf2fe89abb3dec2fa9285f918ae682d58 Parents: 99eeac8 Author: Patrick Wendell pwend...@gmail.com Authored: Thu Aug 20 12:43:13 2015 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Thu Aug 20 12:43:13 2015 -0700 -- assembly/pom.xml| 2 +- bagel/pom.xml | 2 +- core/pom.xml| 2 +- examples/pom.xml| 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml | 2 +- external/kafka-assembly/pom.xml | 2 +- external/kafka/pom.xml | 2 +- external/mqtt-assembly/pom.xml | 2 +- external/mqtt/pom.xml | 2 +- external/twitter/pom.xml| 2 +- external/zeromq/pom.xml | 2 +- extras/java8-tests/pom.xml | 2 +- extras/kinesis-asl-assembly/pom.xml | 2 +- extras/kinesis-asl/pom.xml | 2 +- extras/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml | 2 +- launcher/pom.xml| 2 +- mllib/pom.xml | 2 +- network/common/pom.xml | 2 +- network/shuffle/pom.xml | 2 +- network/yarn/pom.xml| 2 +- pom.xml | 2 +- repl/pom.xml| 2 +- sql/catalyst/pom.xml| 2 +- sql/core/pom.xml| 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml| 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- unsafe/pom.xml | 2 +- yarn/pom.xml| 2 +- 33 files changed, 33 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/eac31abd/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index 1567a0e..e9c6d26 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-rc1/version +version1.5.0-SNAPSHOT/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/eac31abd/bagel/pom.xml -- diff --git a/bagel/pom.xml b/bagel/pom.xml index a0e68c1..ed5c37e 100644 --- a/bagel/pom.xml +++ b/bagel/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-rc1/version +version1.5.0-SNAPSHOT/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/eac31abd/core/pom.xml -- diff --git a/core/pom.xml b/core/pom.xml index b76aab4..4f79d71 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-rc1/version +version1.5.0-SNAPSHOT/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/eac31abd/examples/pom.xml -- diff --git a/examples/pom.xml b/examples/pom.xml index e0afb4a..e6884b0 100644 --- a/examples/pom.xml +++ b/examples/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-rc1/version +version1.5.0-SNAPSHOT/version relativePath../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/eac31abd/external/flume-assembly/pom.xml -- diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml index a4093bb..e05e431 100644 --- a/external/flume-assembly/pom.xml +++ b/external/flume-assembly/pom.xml @@ -21,7 +21,7 @@ parent groupIdorg.apache.spark/groupId artifactIdspark-parent_2.10/artifactId -version1.5.0-rc1/version +version1.5.0-SNAPSHOT/version 
relativePath../../pom.xml/relativePath /parent http://git-wip-us.apache.org/repos/asf/spark/blob/eac31abd/external/flume-sink/pom.xml -- diff --git a/external/flume-sink/pom.xml
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.5.0-rc1 [created] 99eeac8cc
spark git commit: [SPARK-10138] [ML] move setters to MultilayerPerceptronClassifier and add Java test suite
Repository: spark Updated Branches: refs/heads/branch-1.5 eac31abdf - 2e0d2a9cc [SPARK-10138] [ML] move setters to MultilayerPerceptronClassifier and add Java test suite Otherwise, setters do not return self type. jkbradley avulanov Author: Xiangrui Meng m...@databricks.com Closes #8342 from mengxr/SPARK-10138. (cherry picked from commit 2a3d98aae285aba39786e9809f96de412a130f39) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2e0d2a9c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2e0d2a9c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2e0d2a9c Branch: refs/heads/branch-1.5 Commit: 2e0d2a9cc3cb7021e3bdd032d079cf6c8916c725 Parents: eac31ab Author: Xiangrui Meng m...@databricks.com Authored: Thu Aug 20 14:47:04 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Aug 20 14:47:11 2015 -0700 -- .../MultilayerPerceptronClassifier.scala| 54 +++--- ...JavaMultilayerPerceptronClassifierSuite.java | 74 2 files changed, 101 insertions(+), 27 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/2e0d2a9c/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala b/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala index ccca4ec..1e5b0bc 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala @@ -42,9 +42,6 @@ private[ml] trait MultilayerPerceptronParams extends PredictorParams ParamValidators.arrayLengthGt(1) ) - /** @group setParam */ - def setLayers(value: Array[Int]): this.type = set(layers, value) - /** @group getParam */ final def getLayers: Array[Int] = $(layers) @@ -61,33 +58,9 @@ private[ml] trait MultilayerPerceptronParams extends PredictorParams it is adjusted to the size of this data. Recommended size is between 10 and 1000, ParamValidators.gt(0)) - /** @group setParam */ - def setBlockSize(value: Int): this.type = set(blockSize, value) - /** @group getParam */ final def getBlockSize: Int = $(blockSize) - /** - * Set the maximum number of iterations. - * Default is 100. - * @group setParam - */ - def setMaxIter(value: Int): this.type = set(maxIter, value) - - /** - * Set the convergence tolerance of iterations. - * Smaller value will lead to higher accuracy with the cost of more iterations. - * Default is 1E-4. - * @group setParam - */ - def setTol(value: Double): this.type = set(tol, value) - - /** - * Set the seed for weights initialization. - * @group setParam - */ - def setSeed(value: Long): this.type = set(seed, value) - setDefault(maxIter - 100, tol - 1e-4, layers - Array(1, 1), blockSize - 128) } @@ -136,6 +109,33 @@ class MultilayerPerceptronClassifier(override val uid: String) def this() = this(Identifiable.randomUID(mlpc)) + /** @group setParam */ + def setLayers(value: Array[Int]): this.type = set(layers, value) + + /** @group setParam */ + def setBlockSize(value: Int): this.type = set(blockSize, value) + + /** + * Set the maximum number of iterations. + * Default is 100. + * @group setParam + */ + def setMaxIter(value: Int): this.type = set(maxIter, value) + + /** + * Set the convergence tolerance of iterations. 
+ * Smaller value will lead to higher accuracy with the cost of more iterations. + * Default is 1E-4. + * @group setParam + */ + def setTol(value: Double): this.type = set(tol, value) + + /** + * Set the seed for weights initialization. + * @group setParam + */ + def setSeed(value: Long): this.type = set(seed, value) + override def copy(extra: ParamMap): MultilayerPerceptronClassifier = defaultCopy(extra) /** http://git-wip-us.apache.org/repos/asf/spark/blob/2e0d2a9c/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java -- diff --git a/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java b/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java new file mode 100644 index 000..ec6b4bf --- /dev/null +++ b/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or
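A minimal sketch of why the move matters: with the setters declared on the concrete class, each call returns MultilayerPerceptronClassifier rather than the private MultilayerPerceptronParams trait, so chaining compiles from both Scala and Java. The layer sizes below are illustrative:

import org.apache.spark.ml.classification.MultilayerPerceptronClassifier

val mlpc = new MultilayerPerceptronClassifier()
  .setLayers(Array(4, 5, 4, 3)) // 4 inputs, two hidden layers, 3 output classes
  .setBlockSize(128)
  .setMaxIter(100)
  .setTol(1e-4)
  .setSeed(11L)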
spark git commit: [SPARK-10108] Add since tags to mllib.feature
Repository: spark Updated Branches: refs/heads/master 2a3d98aae - 7cfc0750e [SPARK-10108] Add since tags to mllib.feature Author: MechCoder manojkumarsivaraj...@gmail.com Closes #8309 from MechCoder/tags_feature. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7cfc0750 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7cfc0750 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7cfc0750 Branch: refs/heads/master Commit: 7cfc0750e14f2c1b3847e4720cc02150253525a9 Parents: 2a3d98a Author: MechCoder manojkumarsivaraj...@gmail.com Authored: Thu Aug 20 14:56:08 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Aug 20 14:56:08 2015 -0700 -- .../spark/mllib/feature/ChiSqSelector.scala | 12 +--- .../spark/mllib/feature/ElementwiseProduct.scala | 4 +++- .../apache/spark/mllib/feature/HashingTF.scala | 11 ++- .../org/apache/spark/mllib/feature/IDF.scala | 8 +++- .../apache/spark/mllib/feature/Normalizer.scala | 5 - .../org/apache/spark/mllib/feature/PCA.scala | 9 - .../spark/mllib/feature/StandardScaler.scala | 13 - .../spark/mllib/feature/VectorTransformer.scala | 6 +- .../apache/spark/mllib/feature/Word2Vec.scala| 19 ++- 9 files changed, 76 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/7cfc0750/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala index 5f8c1de..fdd974d 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala @@ -19,7 +19,7 @@ package org.apache.spark.mllib.feature import scala.collection.mutable.ArrayBuilder -import org.apache.spark.annotation.Experimental +import org.apache.spark.annotation.{Experimental, Since} import org.apache.spark.mllib.linalg.{DenseVector, SparseVector, Vector, Vectors} import org.apache.spark.mllib.regression.LabeledPoint import org.apache.spark.mllib.stat.Statistics @@ -31,8 +31,10 @@ import org.apache.spark.rdd.RDD * * @param selectedFeatures list of indices to select (filter). Must be ordered asc */ +@Since(1.3.0) @Experimental -class ChiSqSelectorModel (val selectedFeatures: Array[Int]) extends VectorTransformer { +class ChiSqSelectorModel ( + @Since(1.3.0) val selectedFeatures: Array[Int]) extends VectorTransformer { require(isSorted(selectedFeatures), Array has to be sorted asc) @@ -52,6 +54,7 @@ class ChiSqSelectorModel (val selectedFeatures: Array[Int]) extends VectorTransf * @param vector vector to be transformed. * @return transformed vector. */ + @Since(1.3.0) override def transform(vector: Vector): Vector = { compress(vector, selectedFeatures) } @@ -107,8 +110,10 @@ class ChiSqSelectorModel (val selectedFeatures: Array[Int]) extends VectorTransf * @param numTopFeatures number of features that selector will select * (ordered by statistic value descending) */ +@Since(1.3.0) @Experimental -class ChiSqSelector (val numTopFeatures: Int) extends Serializable { +class ChiSqSelector ( + @Since(1.3.0) val numTopFeatures: Int) extends Serializable { /** * Returns a ChiSquared feature selector. @@ -117,6 +122,7 @@ class ChiSqSelector (val numTopFeatures: Int) extends Serializable { * Real-valued features will be treated as categorical for each distinct value. * Apply feature discretizer before using this function. 
*/ + @Since(1.3.0) def fit(data: RDD[LabeledPoint]): ChiSqSelectorModel = { val indices = Statistics.chiSqTest(data) .zipWithIndex.sortBy { case (res, _) = -res.statistic } http://git-wip-us.apache.org/repos/asf/spark/blob/7cfc0750/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala index d67fe6c..33e2d17 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala @@ -17,7 +17,7 @@ package org.apache.spark.mllib.feature -import org.apache.spark.annotation.Experimental +import org.apache.spark.annotation.{Experimental, Since} import org.apache.spark.mllib.linalg._ /** @@ -27,6 +27,7 @@ import
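The tagging pattern the patch applies throughout mllib.feature, shown on a hypothetical class; note that Since is a Spark-internal annotation, so a sketch like this only compiles inside the Spark source tree:

import org.apache.spark.annotation.{Experimental, Since}

// Tag the class, its documented constructor params, and each public method
// with the release that introduced it. ExampleScaler is hypothetical.
@Since("1.4.0")
@Experimental
class ExampleScaler(@Since("1.4.0") val scale: Double) extends Serializable {
  @Since("1.4.0")
  def transform(x: Double): Double = x * scale
}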
spark git commit: [SPARK-9245] [MLLIB] LDA topic assignments
Repository: spark Updated Branches: refs/heads/master 7cfc0750e - eaafe139f [SPARK-9245] [MLLIB] LDA topic assignments For each (document, term) pair, return top topic. Note that instances of (doc, term) pairs within a document (a.k.a. tokens) are exchangeable, so we should provide an estimate per document-term, rather than per token. CC: rotationsymmetry mengxr Author: Joseph K. Bradley jos...@databricks.com Closes #8329 from jkbradley/lda-topic-assignments. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eaafe139 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eaafe139 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eaafe139 Branch: refs/heads/master Commit: eaafe139f881d6105996373c9b11f2ccd91b5b3e Parents: 7cfc075 Author: Joseph K. Bradley jos...@databricks.com Authored: Thu Aug 20 15:01:31 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Aug 20 15:01:31 2015 -0700 -- .../spark/mllib/clustering/LDAModel.scala | 51 ++-- .../spark/mllib/clustering/LDAOptimizer.scala | 2 +- .../spark/mllib/clustering/JavaLDASuite.java| 7 +++ .../spark/mllib/clustering/LDASuite.scala | 21 +++- 4 files changed, 74 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/eaafe139/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala b/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala index b70e380..6bc68a4 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala @@ -17,7 +17,7 @@ package org.apache.spark.mllib.clustering -import breeze.linalg.{DenseMatrix = BDM, DenseVector = BDV, argtopk, normalize, sum} +import breeze.linalg.{DenseMatrix = BDM, DenseVector = BDV, argmax, argtopk, normalize, sum} import breeze.numerics.{exp, lgamma} import org.apache.hadoop.fs.Path import org.json4s.DefaultFormats @@ -438,7 +438,7 @@ object LocalLDAModel extends Loader[LocalLDAModel] { Loader.checkSchema[Data](dataFrame.schema) val topics = dataFrame.collect() val vocabSize = topics(0).getAs[Vector](0).size - val k = topics.size + val k = topics.length val brzTopics = BDM.zeros[Double](vocabSize, k) topics.foreach { case Row(vec: Vector, ind: Int) = @@ -610,6 +610,50 @@ class DistributedLDAModel private[clustering] ( } } + /** + * Return the top topic for each (doc, term) pair. I.e., for each document, what is the most + * likely topic generating each term? + * + * @return RDD of (doc ID, assignment of top topic index for each term), + * where the assignment is specified via a pair of zippable arrays + * (term indices, topic indices). Note that terms will be omitted if not present in + * the document. + */ + lazy val topicAssignments: RDD[(Long, Array[Int], Array[Int])] = { +// For reference, compare the below code with the core part of EMLDAOptimizer.next(). +val eta = topicConcentration +val W = vocabSize +val alpha = docConcentration(0) +val N_k = globalTopicTotals +val sendMsg: EdgeContext[TopicCounts, TokenCount, (Array[Int], Array[Int])] = Unit = + (edgeContext) = { +// E-STEP: Compute gamma_{wjk} (smoothed topic distributions). +val scaledTopicDistribution: TopicCounts = + computePTopic(edgeContext.srcAttr, edgeContext.dstAttr, N_k, W, eta, alpha) +// For this (doc j, term w), send top topic k to doc vertex. 
+val topTopic: Int = argmax(scaledTopicDistribution) +val term: Int = index2term(edgeContext.dstId) +edgeContext.sendToSrc((Array(term), Array(topTopic))) + } +val mergeMsg: ((Array[Int], Array[Int]), (Array[Int], Array[Int])) = (Array[Int], Array[Int]) = + (terms_topics0, terms_topics1) = { +(terms_topics0._1 ++ terms_topics1._1, terms_topics0._2 ++ terms_topics1._2) + } +// M-STEP: Aggregation computes new N_{kj}, N_{wk} counts. +val perDocAssignments = + graph.aggregateMessages[(Array[Int], Array[Int])](sendMsg, mergeMsg).filter(isDocumentVertex) +perDocAssignments.map { case (docID: Long, (terms: Array[Int], topics: Array[Int])) = + // TODO: Avoid zip, which is inefficient. + val (sortedTerms, sortedTopics) = terms.zip(topics).sortBy(_._1).unzip + (docID, sortedTerms.toArray, sortedTopics.toArray) +} + } + + /** Java-friendly version of [[topicAssignments]] */ + lazy val javaTopicAssignments: JavaRDD[(java.lang.Long, Array[Int], Array[Int])] =
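A minimal sketch of the new API, assuming corpus: RDD[(Long, Vector)] of (doc ID, term-count vector); the default EM optimizer returns a DistributedLDAModel, which exposes the lazy topicAssignments added above:

import org.apache.spark.mllib.clustering.{DistributedLDAModel, LDA}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

def printTopAssignments(corpus: RDD[(Long, Vector)]): Unit = {
  val model = new LDA().setK(10).run(corpus).asInstanceOf[DistributedLDAModel]
  model.topicAssignments.take(3).foreach { case (docId, terms, topics) =>
    // terms and topics are zippable: topics(i) is the top topic for terms(i)
    println(s"doc $docId: " + terms.zip(topics).mkString(", "))
  }
}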
spark git commit: [SPARK-9400] [SQL] codegen for StringLocate
Repository: spark Updated Branches: refs/heads/master eaafe139f - afe9f03fd [SPARK-9400] [SQL] codegen for StringLocate This is based on #7779 , thanks to tarekauel . Fix the conflict and nullability. Closes #7779 and #8274 . Author: Tarek Auel tarek.a...@googlemail.com Author: Davies Liu dav...@databricks.com Closes #8330 from davies/stringLocate. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/afe9f03f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/afe9f03f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/afe9f03f Branch: refs/heads/master Commit: afe9f03fd964d1e8604d02feee8d6970efbe6009 Parents: eaafe13 Author: Tarek Auel tarek.a...@googlemail.com Authored: Thu Aug 20 15:10:13 2015 -0700 Committer: Davies Liu davies@gmail.com Committed: Thu Aug 20 15:10:13 2015 -0700 -- .../expressions/stringExpressions.scala | 28 +++- 1 file changed, 27 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/afe9f03f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala index 3c23f2e..b60d318 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala @@ -409,13 +409,14 @@ case class SubstringIndex(strExpr: Expression, delimExpr: Expression, countExpr: * in given string after position pos. */ case class StringLocate(substr: Expression, str: Expression, start: Expression) - extends TernaryExpression with ImplicitCastInputTypes with CodegenFallback { + extends TernaryExpression with ImplicitCastInputTypes { def this(substr: Expression, str: Expression) = { this(substr, str, Literal(0)) } override def children: Seq[Expression] = substr :: str :: start :: Nil + override def nullable: Boolean = substr.nullable || str.nullable override def dataType: DataType = IntegerType override def inputTypes: Seq[DataType] = Seq(StringType, StringType, IntegerType) @@ -441,6 +442,31 @@ case class StringLocate(substr: Expression, str: Expression, start: Expression) } } + override protected def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = { +val substrGen = substr.gen(ctx) +val strGen = str.gen(ctx) +val startGen = start.gen(ctx) +s + int ${ev.primitive} = 0; + boolean ${ev.isNull} = false; + ${startGen.code} + if (!${startGen.isNull}) { +${substrGen.code} +if (!${substrGen.isNull}) { + ${strGen.code} + if (!${strGen.isNull}) { +${ev.primitive} = ${strGen.primitive}.indexOf(${substrGen.primitive}, + ${startGen.primitive}) + 1; + } else { +${ev.isNull} = true; + } +} else { + ${ev.isNull} = true; +} + } + + } + override def prettyName: String = locate } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
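With the CodegenFallback removed, locate is compiled to generated Java like the snippet above. A minimal sketch of the SQL shape that now takes the codegen path, with a hypothetical table and column; locate(substr, str, pos) is 1-based and returns 0 when the substring is absent:

val positions = sqlContext.sql("SELECT locate('ann', name, 1) FROM people").collect()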
spark git commit: [SPARK-10138] [ML] move setters to MultilayerPerceptronClassifier and add Java test suite
Repository: spark Updated Branches: refs/heads/master 907df2fce - 2a3d98aae [SPARK-10138] [ML] move setters to MultilayerPerceptronClassifier and add Java test suite Otherwise, setters do not return self type. jkbradley avulanov Author: Xiangrui Meng m...@databricks.com Closes #8342 from mengxr/SPARK-10138. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2a3d98aa Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2a3d98aa Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2a3d98aa Branch: refs/heads/master Commit: 2a3d98aae285aba39786e9809f96de412a130f39 Parents: 907df2f Author: Xiangrui Meng m...@databricks.com Authored: Thu Aug 20 14:47:04 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Aug 20 14:47:04 2015 -0700 -- .../MultilayerPerceptronClassifier.scala| 54 +++--- ...JavaMultilayerPerceptronClassifierSuite.java | 74 2 files changed, 101 insertions(+), 27 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/2a3d98aa/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala b/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala index ccca4ec..1e5b0bc 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala @@ -42,9 +42,6 @@ private[ml] trait MultilayerPerceptronParams extends PredictorParams ParamValidators.arrayLengthGt(1) ) - /** @group setParam */ - def setLayers(value: Array[Int]): this.type = set(layers, value) - /** @group getParam */ final def getLayers: Array[Int] = $(layers) @@ -61,33 +58,9 @@ private[ml] trait MultilayerPerceptronParams extends PredictorParams it is adjusted to the size of this data. Recommended size is between 10 and 1000, ParamValidators.gt(0)) - /** @group setParam */ - def setBlockSize(value: Int): this.type = set(blockSize, value) - /** @group getParam */ final def getBlockSize: Int = $(blockSize) - /** - * Set the maximum number of iterations. - * Default is 100. - * @group setParam - */ - def setMaxIter(value: Int): this.type = set(maxIter, value) - - /** - * Set the convergence tolerance of iterations. - * Smaller value will lead to higher accuracy with the cost of more iterations. - * Default is 1E-4. - * @group setParam - */ - def setTol(value: Double): this.type = set(tol, value) - - /** - * Set the seed for weights initialization. - * @group setParam - */ - def setSeed(value: Long): this.type = set(seed, value) - setDefault(maxIter - 100, tol - 1e-4, layers - Array(1, 1), blockSize - 128) } @@ -136,6 +109,33 @@ class MultilayerPerceptronClassifier(override val uid: String) def this() = this(Identifiable.randomUID(mlpc)) + /** @group setParam */ + def setLayers(value: Array[Int]): this.type = set(layers, value) + + /** @group setParam */ + def setBlockSize(value: Int): this.type = set(blockSize, value) + + /** + * Set the maximum number of iterations. + * Default is 100. + * @group setParam + */ + def setMaxIter(value: Int): this.type = set(maxIter, value) + + /** + * Set the convergence tolerance of iterations. + * Smaller value will lead to higher accuracy with the cost of more iterations. + * Default is 1E-4. 
+ * @group setParam + */ + def setTol(value: Double): this.type = set(tol, value) + + /** + * Set the seed for weights initialization. + * @group setParam + */ + def setSeed(value: Long): this.type = set(seed, value) + override def copy(extra: ParamMap): MultilayerPerceptronClassifier = defaultCopy(extra) /** http://git-wip-us.apache.org/repos/asf/spark/blob/2a3d98aa/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java -- diff --git a/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java b/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java new file mode 100644 index 000..ec6b4bf --- /dev/null +++ b/mllib/src/test/java/org/apache/spark/ml/classification/JavaMultilayerPerceptronClassifierSuite.java @@ -0,0 +1,74 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding
spark git commit: [SPARK-10108] Add since tags to mllib.feature
Repository: spark Updated Branches: refs/heads/branch-1.5 2e0d2a9cc - 560ec1268 [SPARK-10108] Add since tags to mllib.feature Author: MechCoder manojkumarsivaraj...@gmail.com Closes #8309 from MechCoder/tags_feature. (cherry picked from commit 7cfc0750e14f2c1b3847e4720cc02150253525a9) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/560ec126 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/560ec126 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/560ec126 Branch: refs/heads/branch-1.5 Commit: 560ec1268b824acc01d347a3fbc78ac16216a9b0 Parents: 2e0d2a9 Author: MechCoder manojkumarsivaraj...@gmail.com Authored: Thu Aug 20 14:56:08 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Aug 20 14:59:55 2015 -0700 -- .../spark/mllib/feature/ChiSqSelector.scala | 12 +--- .../spark/mllib/feature/ElementwiseProduct.scala | 4 +++- .../apache/spark/mllib/feature/HashingTF.scala | 11 ++- .../org/apache/spark/mllib/feature/IDF.scala | 8 +++- .../apache/spark/mllib/feature/Normalizer.scala | 5 - .../org/apache/spark/mllib/feature/PCA.scala | 9 - .../spark/mllib/feature/StandardScaler.scala | 13 - .../spark/mllib/feature/VectorTransformer.scala | 6 +- .../apache/spark/mllib/feature/Word2Vec.scala| 19 ++- 9 files changed, 76 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/560ec126/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala index 5f8c1de..fdd974d 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala @@ -19,7 +19,7 @@ package org.apache.spark.mllib.feature import scala.collection.mutable.ArrayBuilder -import org.apache.spark.annotation.Experimental +import org.apache.spark.annotation.{Experimental, Since} import org.apache.spark.mllib.linalg.{DenseVector, SparseVector, Vector, Vectors} import org.apache.spark.mllib.regression.LabeledPoint import org.apache.spark.mllib.stat.Statistics @@ -31,8 +31,10 @@ import org.apache.spark.rdd.RDD * * @param selectedFeatures list of indices to select (filter). Must be ordered asc */ +@Since(1.3.0) @Experimental -class ChiSqSelectorModel (val selectedFeatures: Array[Int]) extends VectorTransformer { +class ChiSqSelectorModel ( + @Since(1.3.0) val selectedFeatures: Array[Int]) extends VectorTransformer { require(isSorted(selectedFeatures), Array has to be sorted asc) @@ -52,6 +54,7 @@ class ChiSqSelectorModel (val selectedFeatures: Array[Int]) extends VectorTransf * @param vector vector to be transformed. * @return transformed vector. */ + @Since(1.3.0) override def transform(vector: Vector): Vector = { compress(vector, selectedFeatures) } @@ -107,8 +110,10 @@ class ChiSqSelectorModel (val selectedFeatures: Array[Int]) extends VectorTransf * @param numTopFeatures number of features that selector will select * (ordered by statistic value descending) */ +@Since(1.3.0) @Experimental -class ChiSqSelector (val numTopFeatures: Int) extends Serializable { +class ChiSqSelector ( + @Since(1.3.0) val numTopFeatures: Int) extends Serializable { /** * Returns a ChiSquared feature selector. 
@@ -117,6 +122,7 @@ class ChiSqSelector (val numTopFeatures: Int) extends Serializable { * Real-valued features will be treated as categorical for each distinct value. * Apply feature discretizer before using this function. */ + @Since("1.3.0") def fit(data: RDD[LabeledPoint]): ChiSqSelectorModel = { val indices = Statistics.chiSqTest(data) .zipWithIndex.sortBy { case (res, _) => -res.statistic } http://git-wip-us.apache.org/repos/asf/spark/blob/560ec126/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala index d67fe6c..33e2d17 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/ElementwiseProduct.scala @@ -17,7 +17,7 @@ package org.apache.spark.mllib.feature -import org.apache.spark.annotation.Experimental +import
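For orientation, a short sketch (with made-up data, not from this commit) of the ChiSqSelector API that these @Since("1.3.0") tags document: fit a selector on labeled points, then project the feature vectors:

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.feature.ChiSqSelector
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    // Keep the single most predictive feature; feature values are treated
    // as categorical, per the fit() doc above. The data here is invented.
    def selectTopFeature(sc: SparkContext) = {
      val data = sc.parallelize(Seq(
        LabeledPoint(0.0, Vectors.dense(0.0, 1.0, 3.0)),
        LabeledPoint(1.0, Vectors.dense(8.0, 2.0, 0.0))))
      val model = new ChiSqSelector(numTopFeatures = 1).fit(data)
      data.map(p => model.transform(p.features))
    }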
spark git commit: [SPARK-9245] [MLLIB] LDA topic assignments
Repository: spark Updated Branches: refs/heads/branch-1.5 560ec1268 -> 2beea65bf [SPARK-9245] [MLLIB] LDA topic assignments For each (document, term) pair, return top topic. Note that instances of (doc, term) pairs within a document (a.k.a. tokens) are exchangeable, so we should provide an estimate per document-term, rather than per token. CC: rotationsymmetry mengxr Author: Joseph K. Bradley jos...@databricks.com Closes #8329 from jkbradley/lda-topic-assignments. (cherry picked from commit eaafe139f881d6105996373c9b11f2ccd91b5b3e) Signed-off-by: Xiangrui Meng m...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2beea65b Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2beea65b Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2beea65b Branch: refs/heads/branch-1.5 Commit: 2beea65bfbbf4a94ad6b7ca5e4c24f59089f6099 Parents: 560ec12 Author: Joseph K. Bradley jos...@databricks.com Authored: Thu Aug 20 15:01:31 2015 -0700 Committer: Xiangrui Meng m...@databricks.com Committed: Thu Aug 20 15:01:37 2015 -0700 -- .../spark/mllib/clustering/LDAModel.scala | 51 ++-- .../spark/mllib/clustering/LDAOptimizer.scala | 2 +- .../spark/mllib/clustering/JavaLDASuite.java | 7 +++ .../spark/mllib/clustering/LDASuite.scala | 21 +++- 4 files changed, 74 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/2beea65b/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala -- diff --git a/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala b/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala index b70e380..6bc68a4 100644 --- a/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala +++ b/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala @@ -17,7 +17,7 @@ package org.apache.spark.mllib.clustering -import breeze.linalg.{DenseMatrix => BDM, DenseVector => BDV, argtopk, normalize, sum} +import breeze.linalg.{DenseMatrix => BDM, DenseVector => BDV, argmax, argtopk, normalize, sum} import breeze.numerics.{exp, lgamma} import org.apache.hadoop.fs.Path import org.json4s.DefaultFormats @@ -438,7 +438,7 @@ object LocalLDAModel extends Loader[LocalLDAModel] { Loader.checkSchema[Data](dataFrame.schema) val topics = dataFrame.collect() val vocabSize = topics(0).getAs[Vector](0).size - val k = topics.size + val k = topics.length val brzTopics = BDM.zeros[Double](vocabSize, k) topics.foreach { case Row(vec: Vector, ind: Int) => @@ -610,6 +610,50 @@ class DistributedLDAModel private[clustering] ( } } + /** + * Return the top topic for each (doc, term) pair. I.e., for each document, what is the most + * likely topic generating each term? + * + * @return RDD of (doc ID, assignment of top topic index for each term), + * where the assignment is specified via a pair of zippable arrays + * (term indices, topic indices). Note that terms will be omitted if not present in + * the document. + */ + lazy val topicAssignments: RDD[(Long, Array[Int], Array[Int])] = { +// For reference, compare the below code with the core part of EMLDAOptimizer.next(). +val eta = topicConcentration +val W = vocabSize +val alpha = docConcentration(0) +val N_k = globalTopicTotals +val sendMsg: EdgeContext[TopicCounts, TokenCount, (Array[Int], Array[Int])] => Unit = + (edgeContext) => { +// E-STEP: Compute gamma_{wjk} (smoothed topic distributions). 
+val scaledTopicDistribution: TopicCounts = + computePTopic(edgeContext.srcAttr, edgeContext.dstAttr, N_k, W, eta, alpha) +// For this (doc j, term w), send top topic k to doc vertex. +val topTopic: Int = argmax(scaledTopicDistribution) +val term: Int = index2term(edgeContext.dstId) +edgeContext.sendToSrc((Array(term), Array(topTopic))) + } +val mergeMsg: ((Array[Int], Array[Int]), (Array[Int], Array[Int])) => (Array[Int], Array[Int]) = + (terms_topics0, terms_topics1) => { +(terms_topics0._1 ++ terms_topics1._1, terms_topics0._2 ++ terms_topics1._2) + } +// M-STEP: Aggregation computes new N_{kj}, N_{wk} counts. +val perDocAssignments = + graph.aggregateMessages[(Array[Int], Array[Int])](sendMsg, mergeMsg).filter(isDocumentVertex) +perDocAssignments.map { case (docID: Long, (terms: Array[Int], topics: Array[Int])) => + // TODO: Avoid zip, which is inefficient. + val (sortedTerms, sortedTopics) = terms.zip(topics).sortBy(_._1).unzip + (docID, sortedTerms.toArray, sortedTopics.toArray) +} + } + + /**
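To make the shape of the new API concrete, a brief sketch (with an assumed model, not from this commit) of consuming topicAssignments, whose per-document term and topic arrays zip together:

    import org.apache.spark.mllib.clustering.DistributedLDAModel

    // Print the top topic for each (document, term) pair in a few documents;
    // "model" is an assumed, already-trained DistributedLDAModel.
    def showTopicAssignments(model: DistributedLDAModel): Unit = {
      model.topicAssignments.take(3).foreach { case (docId, terms, topics) =>
        terms.zip(topics).foreach { case (term, topic) =>
          println(s"doc=$docId term=$term -> topic=$topic")
        }
      }
    }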
Git Push Summary
Repository: spark Updated Tags: refs/tags/v1.5.0-rc1 [deleted] 99eeac8cc
[2/2] spark git commit: Preparing development version 1.5.0-SNAPSHOT
Preparing development version 1.5.0-SNAPSHOT Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/175c1d9c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/175c1d9c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/175c1d9c Branch: refs/heads/branch-1.5 Commit: 175c1d9c90d47a469568e14b4b90a440b8d9e95c Parents: d837d51 Author: Patrick Wendell pwend...@gmail.com Authored: Thu Aug 20 15:33:10 2015 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Thu Aug 20 15:33:10 2015 -0700 -- assembly/pom.xml | 2 +- bagel/pom.xml | 2 +- core/pom.xml | 2 +- examples/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml | 2 +- external/kafka-assembly/pom.xml | 2 +- external/kafka/pom.xml | 2 +- external/mqtt-assembly/pom.xml | 2 +- external/mqtt/pom.xml | 2 +- external/twitter/pom.xml | 2 +- external/zeromq/pom.xml | 2 +- extras/java8-tests/pom.xml | 2 +- extras/kinesis-asl-assembly/pom.xml | 2 +- extras/kinesis-asl/pom.xml | 2 +- extras/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib/pom.xml | 2 +- network/common/pom.xml | 2 +- network/shuffle/pom.xml | 2 +- network/yarn/pom.xml | 2 +- pom.xml | 2 +- repl/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- unsafe/pom.xml | 2 +- yarn/pom.xml | 2 +- 33 files changed, 33 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/175c1d9c/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index 3ef7d6f..e9c6d26 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0</version> +<version>1.5.0-SNAPSHOT</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/175c1d9c/bagel/pom.xml -- diff --git a/bagel/pom.xml b/bagel/pom.xml index 684e07b..ed5c37e 100644 --- a/bagel/pom.xml +++ b/bagel/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0</version> +<version>1.5.0-SNAPSHOT</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/175c1d9c/core/pom.xml -- diff --git a/core/pom.xml b/core/pom.xml index 6b082ad..4f79d71 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0</version> +<version>1.5.0-SNAPSHOT</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/175c1d9c/examples/pom.xml -- diff --git a/examples/pom.xml b/examples/pom.xml index 9ef1eda..e6884b0 100644 --- a/examples/pom.xml +++ b/examples/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0</version> +<version>1.5.0-SNAPSHOT</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/175c1d9c/external/flume-assembly/pom.xml -- diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml index fe878e6..e05e431 100644 --- a/external/flume-assembly/pom.xml +++ b/external/flume-assembly/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0</version> +<version>1.5.0-SNAPSHOT</version> 
<relativePath>../../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/175c1d9c/external/flume-sink/pom.xml -- diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml index
[1/2] spark git commit: Preparing Spark release v1.5.0-rc1
Repository: spark Updated Branches: refs/heads/branch-1.5 2beea65bf -> 175c1d9c9 Preparing Spark release v1.5.0-rc1 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d837d51d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d837d51d Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d837d51d Branch: refs/heads/branch-1.5 Commit: d837d51d54510310f7bae05cd331c0c23946404c Parents: 2beea65 Author: Patrick Wendell pwend...@gmail.com Authored: Thu Aug 20 15:33:04 2015 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Thu Aug 20 15:33:04 2015 -0700 -- assembly/pom.xml | 2 +- bagel/pom.xml | 2 +- core/pom.xml | 2 +- examples/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml | 2 +- external/kafka-assembly/pom.xml | 2 +- external/kafka/pom.xml | 2 +- external/mqtt-assembly/pom.xml | 2 +- external/mqtt/pom.xml | 2 +- external/twitter/pom.xml | 2 +- external/zeromq/pom.xml | 2 +- extras/java8-tests/pom.xml | 2 +- extras/kinesis-asl-assembly/pom.xml | 2 +- extras/kinesis-asl/pom.xml | 2 +- extras/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib/pom.xml | 2 +- network/common/pom.xml | 2 +- network/shuffle/pom.xml | 2 +- network/yarn/pom.xml | 2 +- pom.xml | 2 +- repl/pom.xml | 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- unsafe/pom.xml | 2 +- yarn/pom.xml | 2 +- 33 files changed, 33 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/d837d51d/assembly/pom.xml -- diff --git a/assembly/pom.xml b/assembly/pom.xml index e9c6d26..3ef7d6f 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0-SNAPSHOT</version> +<version>1.5.0</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/d837d51d/bagel/pom.xml -- diff --git a/bagel/pom.xml b/bagel/pom.xml index ed5c37e..684e07b 100644 --- a/bagel/pom.xml +++ b/bagel/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0-SNAPSHOT</version> +<version>1.5.0</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/d837d51d/core/pom.xml -- diff --git a/core/pom.xml b/core/pom.xml index 4f79d71..6b082ad 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0-SNAPSHOT</version> +<version>1.5.0</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/d837d51d/examples/pom.xml -- diff --git a/examples/pom.xml b/examples/pom.xml index e6884b0..9ef1eda 100644 --- a/examples/pom.xml +++ b/examples/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> -<version>1.5.0-SNAPSHOT</version> +<version>1.5.0</version> <relativePath>../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/d837d51d/external/flume-assembly/pom.xml -- diff --git a/external/flume-assembly/pom.xml b/external/flume-assembly/pom.xml index e05e431..fe878e6 100644 --- a/external/flume-assembly/pom.xml +++ b/external/flume-assembly/pom.xml @@ -21,7 +21,7 @@ <parent> <groupId>org.apache.spark</groupId> <artifactId>spark-parent_2.10</artifactId> 
-<version>1.5.0-SNAPSHOT</version> +<version>1.5.0</version> <relativePath>../../pom.xml</relativePath> </parent> http://git-wip-us.apache.org/repos/asf/spark/blob/d837d51d/external/flume-sink/pom.xml -- diff --git