[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22812 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4562/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22812 Merged build finished. Test PASSed.
[GitHub] spark pull request #22309: [SPARK-20384][SQL] Support value class in schema ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22309#discussion_r228730753
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala ---
    @@ -358,4 +368,20 @@ class ScalaReflectionSuite extends SparkFunSuite {
         assert(numberOfCheckedArguments(deserializerFor[(java.lang.Double, Int)]) == 1)
         assert(numberOfCheckedArguments(deserializerFor[(java.lang.Integer, java.lang.Integer)]) == 0)
       }
    +
    +  test("schema for case class that is a value class") {
    +    val schema = schemaFor[TestingValueClass.IntWrapper]
    +    assert(schema === Schema(IntegerType, nullable = false))
    +  }
    +
    +  test("schema for case class that contains value class fields") {
    +    val schema = schemaFor[TestingValueClass.ValueClassData]
    +    assert(schema === Schema(
    +      StructType(Seq(
    +        StructField("intField", IntegerType, nullable = false),
    +        StructField("wrappedInt", IntegerType, nullable = false),
    --- End diff --
To confirm: a Scala value class wrapping a primitive type can't be null?
[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22812 **[Test build #98147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98147/testReport)** for PR 22812 at commit [`517bebf`](https://github.com/apache/spark/commit/517bebfb1e49f2315019696a50b657dcf715778c).
[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22812 retest this please
[GitHub] spark issue #22309: [SPARK-20384][SQL] Support value class in schema of Data...
Github user mt40 commented on the issue: https://github.com/apache/spark/pull/22309 @cloud-fan It works now. Actually, top-level value classes have been supported since [SPARK-17368](https://issues.apache.org/jira/browse/SPARK-17368). This patch keeps that behavior and adds support for nested value classes.
[GitHub] spark issue #22858: [SPARK-24709][SQL][2.4] use str instead of basestring in...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22858 @HyukjinKwon thanks for the information! Shall we replace `str` with `basestring` in `functions.py` for the master branch?
[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22858#discussion_r228730582
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2326,7 +2326,7 @@ def schema_of_json(json):
         >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
         [Row(json=u'struct')]
         """
    -    if isinstance(json, basestring):
    +    if isinstance(json, str):
    --- End diff --
Shall we apply it to 2.4? I'm not aware of the background: why did we not put
```
if sys.version >= '3':
    basestring = str
```
in 2.4?
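For readers unfamiliar with the shim quoted in this comment, here is a minimal sketch of how such a compatibility alias behaves. It is an illustration only: `is_string` is a hypothetical helper that does not exist in `functions.py`, and the version check uses `sys.version_info` instead of the string comparison quoted above.

```python
import sys

# On Python 3 the built-in name `basestring` no longer exists, so alias it
# to `str`. On Python 2 the built-in `basestring` is left untouched.
if sys.version_info[0] >= 3:
    basestring = str


def is_string(value):
    """Hypothetical helper: True for any string type on either Python version."""
    return isinstance(value, basestring)


print(is_string('{"a": 0}'))  # a JSON literal passed as a plain string
print(is_string(0))           # a non-string argument
```

With the alias in place, code written against `basestring` keeps working on both major Python versions, which is presumably why the question of carrying it into 2.4 comes up.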
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98145/testReport)** for PR 22784 at commit [`18af032`](https://github.com/apache/spark/commit/18af0325e95552a00983983224795e71f2e66204). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Merged build finished. Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98145/ Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Merged build finished. Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98144/ Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98144/testReport)** for PR 22784 at commit [`094594b`](https://github.com/apache/spark/commit/094594bf63a22be65bac7b31932d5d870f1142d3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22809 Merged build finished. Test PASSed.
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22809 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98139/ Test PASSed.
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22809 **[Test build #98139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98139/testReport)** for PR 22809 at commit [`07205de`](https://github.com/apache/spark/commit/07205dea343539cb812622205fd0534b77f183d0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21732 **[Test build #98146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98146/testReport)** for PR 21732 at commit [`fec1cac`](https://github.com/apache/spark/commit/fec1cac2c5f8fa5226001820c24fe5fc8304fe3f).
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4561/ Test PASSed.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Merged build finished. Test PASSed.
[GitHub] spark issue #22817: [SPARK-25816][SQL] Fix attribute resolution in nested ex...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22817 The fix looks fine to me. cc @cloud-fan @hvanhovell
[GitHub] spark pull request #22817: [SPARK-25816][SQL] Fix attribute resolution in ne...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22817#discussion_r228729920
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2578,4 +2578,12 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
             Row ("abc", 1))
         }
       }
    +
    +  test("SPARK-25816 ResolveReferences works with nested extractors") {
    +    val df0 = Seq((1, Map(1 -> "a")), (2, Map(2 -> "b"))).toDF("1", "2")
    +    val df1 = df0.select($"1".as("2"), $"2".as("1"))
    +    val df2 = df1.filter($"1"(map_keys($"1")(0)) > "a")
    --- End diff --
We are unable to resolve the expressions in `extraction` of `UnresolvedExtractValue`. We can simplify the expression in the `extraction`. For example, `df1.filter($"1"($"2") > "a")`.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98145/testReport)** for PR 22784 at commit [`18af032`](https://github.com/apache/spark/commit/18af0325e95552a00983983224795e71f2e66204).
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/22784#discussion_r228729214
    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/PCASuite.scala ---
    @@ -54,4 +55,21 @@ class PCASuite extends SparkFunSuite with MLlibTestSparkContext {
         // check overflowing
         assert(PCAUtil.memoryCost(4, 6) > Int.MaxValue)
       }
    +
    +  test("number of features more than 65535") {
    +    val rows = 10
    +    val columns = 10
    +    val k = 5
    +    val randomRDD = RandomRDDs.normalVectorRDD(sc, rows, columns, 0, 0)
    +    val pca = new PCA(k).fit(randomRDD)
    +    assert(pca.explainedVariance.size === 5)
    +    assert(pca.pc.numRows === 10 && pca.pc.numCols === 5)
    +    // Eigen values should not be negative
    +    assert(!pca.explainedVariance.values.exists(_ < 0))
    +
    +    // Norm of the principle component should be 1.0
    --- End diff --
Done.
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/22784#discussion_r228729215
    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/PCASuite.scala ---
    @@ -54,4 +55,21 @@ class PCASuite extends SparkFunSuite with MLlibTestSparkContext {
         // check overflowing
         assert(PCAUtil.memoryCost(4, 6) > Int.MaxValue)
       }
    +
    +  test("number of features more than 65535") {
    +    val rows = 10
    +    val columns = 10
    +    val k = 5
    +    val randomRDD = RandomRDDs.normalVectorRDD(sc, rows, columns, 0, 0)
    +    val pca = new PCA(k).fit(randomRDD)
    +    assert(pca.explainedVariance.size === 5)
    +    assert(pca.pc.numRows === 10 && pca.pc.numCols === 5)
    +    // Eigen values should not be negative
    +    assert(!pca.explainedVariance.values.exists(_ < 0))
    --- End diff --
Done.
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/22784#discussion_r228729208
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/PCA.scala ---
    @@ -49,7 +50,16 @@ class PCA @Since("1.4.0") (@Since("1.4.0") val k: Int) {
           "Try reducing the parameter k for PCA, or reduce the input feature " +
           "vector dimension to make this tractable.")
     
    -    val mat = new RowMatrix(sources)
    +    val mat = if (numFeatures > 65535) {
    +      val meanVector = Statistics.colStats(sources).mean
    --- End diff --
I have modified it. Thanks.
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/22784#discussion_r228729201
    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/PCASuite.scala ---
    @@ -54,4 +55,14 @@ class PCASuite extends SparkFunSuite with MLlibTestSparkContext {
         // check overflowing
         assert(PCAUtil.memoryCost(4, 6) > Int.MaxValue)
       }
    +
    +  test("number of features more than 65500") {
    +    val rows = 10
    +    val columns = 10
    +    val k = 5
    +    val randomRDD = RandomRDDs.normalVectorRDD(sc, rows, columns, 0, 0)
    +    val pca = new PCA(k).fit(randomRDD)
    +    assert(pca.explainedVariance.size === 5)
    +    assert(pca.pc.numRows === 10 && pca.pc.numCols === 5)
    --- End diff --
Thanks. I have updated the test case.
[GitHub] spark issue #21950: [SPARK-24914][SQL] Add configuration to avoid OOM during...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21950 Merged build finished. Test PASSed.
[GitHub] spark issue #21950: [SPARK-24914][SQL] Add configuration to avoid OOM during...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21950 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98137/ Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98144/testReport)** for PR 22784 at commit [`094594b`](https://github.com/apache/spark/commit/094594bf63a22be65bac7b31932d5d870f1142d3).
[GitHub] spark issue #21950: [SPARK-24914][SQL] Add configuration to avoid OOM during...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21950 **[Test build #98137 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98137/testReport)** for PR 21950 at commit [`ddfe945`](https://github.com/apache/spark/commit/ddfe945ef161e59fc2bbc1a12bf40563d2bdd400). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19601 Thanks!
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22863 Thanks @felixcheung
[GitHub] spark pull request #22863: [SPARK-25859][ML]add scala/java/python example an...
Github user huaxingao closed the pull request at: https://github.com/apache/spark/pull/22863
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22863 please close this PR. thx
[GitHub] spark pull request #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22843
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22863 merged to 2.4
[GitHub] spark pull request #22863: [SPARK-25859][ML]add scala/java/python example an...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22863#discussion_r228727912
    --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/PrefixSpanExample.scala ---
    @@ -0,0 +1,62 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.examples.ml
    +
    +// scalastyle:off println
    +
    +// $example on$
    +import org.apache.spark.ml.fpm.PrefixSpan
    +// $example off$
    +import org.apache.spark.sql.SparkSession
    +
    +/**
    + * An example demonstrating PrefixSpan.
    + * Run with
    + * {{{
    + * bin/run-example ml.PrefixSpanExample
    + * }}}
    + */
    +object PrefixSpanExample {
    +
    +  def main(args: Array[String]): Unit = {
    +    val spark = SparkSession
    +      .builder
    +      .appName(s"${this.getClass.getSimpleName}")
    +      .getOrCreate()
    +    import spark.implicits._
    +
    +    // $example on$
    +    val smallTestData = Seq(
    +      Seq(Seq(1, 2), Seq(3)),
    +      Seq(Seq(1), Seq(3, 2), Seq(1, 2)),
    +      Seq(Seq(1, 2), Seq(5)),
    +      Seq(Seq(6)))
    +
    +    val df = smallTestData.toDF("sequence")
    +    val result = new PrefixSpan()
    +      .setMinSupport(0.5)
    +      .setMaxPatternLength(5)
    +      .setMaxLocalProjDBSize(3200)
    +      .findFrequentSequentialPatterns(df)
    +      .show()
    +    // $example off$
    +
    +    spark.stop()
    +  }
    +}
    +// scalastyle:on println
    --- End diff --
nit: looks like println is not used in example here
[GitHub] spark issue #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFilter.en...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22865 **[Test build #98143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98143/testReport)** for PR 22865 at commit [`af8a85a`](https://github.com/apache/spark/commit/af8a85ae4a1e477801bf104af6d4909cd822ba01).
[GitHub] spark issue #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFilter.en...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22865 ok to test
[GitHub] spark issue #22843: [SPARK-16693][SPARKR] Remove methods deprecated
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22843 merged to master
[GitHub] spark issue #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFilter.en...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22865 **[Test build #98142 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98142/testReport)** for PR 22865 at commit [`af8a85a`](https://github.com/apache/spark/commit/af8a85ae4a1e477801bf104af6d4909cd822ba01).
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98141/ Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Merged build finished. Test PASSed.
[GitHub] spark issue #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFilter.en...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22865 Can one of the admins verify this patch?
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98141/testReport)** for PR 22784 at commit [`3cbe017`](https://github.com/apache/spark/commit/3cbe017c640764db0fe95bcc2a820917bbc5fb3e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFilter.en...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22865 Can one of the admins verify this patch?
[GitHub] spark pull request #22865: [DOC] Fix doc for spark.sql.parquet.recordLevelFi...
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/22865

[DOC] Fix doc for spark.sql.parquet.recordLevelFilter.enabled

## What changes were proposed in this pull request?

Updated the doc string value for spark.sql.parquet.recordLevelFilter.enabled to indicate that spark.sql.parquet.enableVectorizedReader must be disabled.

The code in ParquetFileFormat uses spark.sql.parquet.recordLevelFilter.enabled only after falling back to parquet-mr (see else for this if statement):

https://github.com/apache/spark/blob/d5573c578a1eea9ee04886d9df37c7178e67bb30/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala#L412
https://github.com/apache/spark/blob/d5573c578a1eea9ee04886d9df37c7178e67bb30/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala#L427-L430

Tests also bear this out.

## How was this patch tested?

This is just a doc string fix: I built Spark and ran a single test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bersprockets/spark confdocfix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22865.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22865

commit af8a85ae4a1e477801bf104af6d4909cd822ba01
Author: Bruce Robbins
Date: 2018-10-27T21:47:50Z

    update doc string for spark.sql.parquet.recordLevelFilter.enabled
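The interaction between the two settings described in this pull request can be modelled in a few lines of Python. This is only a sketch of the decision as the PR text states it, not the code in ParquetFileFormat: the function name and the plain `dict` standing in for the Spark SQL configuration are assumptions made for illustration, the fallback values passed to `get` are placeholders, and other conditions (for example, whether a filter was actually pushed down) are ignored.

```python
def record_level_filter_in_effect(conf):
    """Return True when record-level filtering would take effect.

    `conf` is a plain dict standing in for the Spark SQL configuration
    (an assumption of this sketch). Per the PR description, the
    recordLevelFilter flag is consulted only after falling back to
    parquet-mr, that is, when the vectorized reader is disabled.
    """
    vectorized = conf.get("spark.sql.parquet.enableVectorizedReader", "true") == "true"
    record_filter = conf.get("spark.sql.parquet.recordLevelFilter.enabled", "false") == "true"
    return (not vectorized) and record_filter


# Enabling the filter flag alone is not enough while the vectorized reader is on.
both_on = {
    "spark.sql.parquet.enableVectorizedReader": "true",
    "spark.sql.parquet.recordLevelFilter.enabled": "true",
}
print(record_level_filter_in_effect(both_on))

# Disabling the vectorized reader lets the filter flag take effect.
fallback = dict(both_on, **{"spark.sql.parquet.enableVectorizedReader": "false"})
print(record_level_filter_in_effect(fallback))
```

The point of the doc fix is the first case: a user who sets only the record-level filter flag sees no change in behavior unless the vectorized reader is also turned off.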
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98140/testReport)** for PR 22784 at commit [`5674e17`](https://github.com/apache/spark/commit/5674e177b7894d61904c6748dbf7721359163938). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98140/ Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Merged build finished. Test PASSed.
[GitHub] spark issue #21157: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21157 I meant to use https://github.com/apache/spark/blob/a97001d21757ae214c86371141bd78a376200f66/python/pyspark/serializers.py#L583 instead of https://github.com/apache/spark/blob/a97001d21757ae214c86371141bd78a376200f66/python/pyspark/serializers.py#L561
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98141/testReport)** for PR 22784 at commit [`3cbe017`](https://github.com/apache/spark/commit/3cbe017c640764db0fe95bcc2a820917bbc5fb3e).
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98140/testReport)** for PR 22784 at commit [`5674e17`](https://github.com/apache/spark/commit/5674e177b7894d61904c6748dbf7721359163938).
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22809 LGTM
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22809 **[Test build #98139 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98139/testReport)** for PR 22809 at commit [`07205de`](https://github.com/apache/spark/commit/07205dea343539cb812622205fd0534b77f183d0).
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22809 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4560/ Test PASSed.
[GitHub] spark issue #22809: [SPARK-19851][SQL] Add support for EVERY and ANY (SOME) ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22809 Merged build finished. Test PASSed.
[GitHub] spark pull request #22809: [SPARK-19851][SQL] Add support for EVERY and ANY ...
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22809#discussion_r228725645

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/UnevaluableAggs.scala ---
    @@ -0,0 +1,62 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions.aggregate
    +
    +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.types._
    +
    +abstract class UnevaluableBooleanAggBase(arg: Expression)
    --- End diff --

    @cloud-fan @mgaido91 Thank you. I have added a TODO for now.
[GitHub] spark issue #21157: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...
Github user superbobry commented on the issue:

    https://github.com/apache/spark/pull/21157

    > I think people do define NamedTuples in Notebooks, so I'm going to stick with -1.

    @holdenk I understand your point, but there is still something we can do without breaking existing code that relies on namedtuple serialization.

    Option 1: switch to cloudpickle as suggested by @HyukjinKwon.
    Option 2: #21180.

    What would be your choice between the two?
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98138/ Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Merged build finished. Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22863 **[Test build #98138 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98138/testReport)** for PR 22863 at commit [`ddcab50`](https://github.com/apache/spark/commit/ddcab50d458dbfad843f74d55aedc51da5c3b6d0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22784#discussion_r228724594

    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/PCASuite.scala ---
    @@ -54,4 +55,21 @@ class PCASuite extends SparkFunSuite with MLlibTestSparkContext {
         // check overflowing
         assert(PCAUtil.memoryCost(4, 6) > Int.MaxValue)
       }
    +
    +  test("number of features more than 65535") {
    +    val rows = 10
    +    val columns = 10
    +    val k = 5
    +    val randomRDD = RandomRDDs.normalVectorRDD(sc, rows, columns, 0, 0)
    +    val pca = new PCA(k).fit(randomRDD)
    +    assert(pca.explainedVariance.size === 5)
    +    assert(pca.pc.numRows === 10 && pca.pc.numCols === 5)
    +    // Eigen values should not be negative
    +    assert(!pca.explainedVariance.values.exists(_ < 0))
    +
    +    // Norm of the principle component should be 1.0
    --- End diff --

    Nit: principle -> principal
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22784#discussion_r228724541

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala ---
    @@ -384,18 +384,28 @@ class RowMatrix @Since("1.0.0") (
         val n = numCols().toInt
         require(k > 0 && k <= n, s"k = $k out of range (0, n = $n]")
    -    val Cov = computeCovariance().asBreeze.asInstanceOf[BDM[Double]]
    +    if (n > 65535) {
    +      val svd = computeSVD(k)
    +      val s = svd.s.toArray.map(eigValue => eigValue * eigValue / (n - 1))
    --- End diff --

    Right, makes sense.
[GitHub] spark issue #22861: [SPARK-25663][SPARK-25661][SQL][TEST] Refactor BuiltInDa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22861 Merged build finished. Test PASSed.
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22784#discussion_r228724515

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/PCA.scala ---
    @@ -49,7 +50,16 @@ class PCA @Since("1.4.0") (@Since("1.4.0") val k: Int) {
           "Try reducing the parameter k for PCA, or reduce the input feature " +
           "vector dimension to make this tractable.")
    -    val mat = new RowMatrix(sources)
    +    val mat = if (numFeatures > 65535) {
    +      val meanVector = Statistics.colStats(sources).mean
    --- End diff --

    Rather than calling `.toArray` and `.zipped` below, can this not be written as Vector - Vector in the loop below? It might be more efficient.
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22784#discussion_r228724667

    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/PCASuite.scala ---
    @@ -54,4 +55,14 @@ class PCASuite extends SparkFunSuite with MLlibTestSparkContext {
         // check overflowing
         assert(PCAUtil.memoryCost(4, 6) > Int.MaxValue)
       }
    +
    +  test("number of features more than 65500") {
    +    val rows = 10
    +    val columns = 10
    +    val k = 5
    +    val randomRDD = RandomRDDs.normalVectorRDD(sc, rows, columns, 0, 0)
    +    val pca = new PCA(k).fit(randomRDD)
    +    assert(pca.explainedVariance.size === 5)
    +    assert(pca.pc.numRows === 10 && pca.pc.numCols === 5)
    --- End diff --

    Is there an easy dummy test case we can write where we know what the first PC should be? For example, if you generate a bunch of vectors like (a +/- epsilon, a +/- epsilon, ...) for many a, the principal component should be nearly (1,1,1...), right? Is that easy enough to add as a trivial test of the actual analysis? I think that would really prove it, though your manual test suggests it's working.
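The test idea in this comment can be sketched without Spark. The following is only an illustration of the underlying math, not the test that was added to `PCASuite`: the object name, helper names, sizes, and tolerance are all assumptions, and power iteration stands in for whatever solver the real code uses. Note that a unit-length principal component along (1,1,1...) has entries of 1/sqrt(d), and its sign is arbitrary.

```scala
import scala.util.Random

// Hypothetical, Spark-free sketch: all names and constants here are illustrative assumptions.
object KnownFirstPcSketch {
  // Rows of the form (a + noise, a + noise, ...): nearly all variance lies along (1, 1, ..., 1).
  def makeRows(numRows: Int, dim: Int, eps: Double, rng: Random): Array[Array[Double]] =
    Array.fill(numRows) {
      val a = rng.nextDouble() * 20.0 - 10.0
      Array.fill(dim)(a + (rng.nextDouble() * 2.0 - 1.0) * eps)
    }

  // Sample covariance matrix of the columns (rows are observations).
  def covariance(rows: Array[Array[Double]]): Array[Array[Double]] = {
    val m = rows.length
    val d = rows.head.length
    val mean = Array.tabulate(d)(j => rows.map(_(j)).sum / m)
    Array.tabulate(d, d) { (i, j) =>
      rows.map(r => (r(i) - mean(i)) * (r(j) - mean(j))).sum / (m - 1)
    }
  }

  // Power iteration: repeatedly apply the matrix and renormalize to approach the top eigenvector.
  def topEigenvector(mat: Array[Array[Double]], iterations: Int): Array[Double] = {
    var v = Array.tabulate(mat.length)(i => 1.0 + i) // arbitrary non-zero start
    for (_ <- 0 until iterations) {
      val w = mat.map(row => row.zip(v).map { case (x, y) => x * y }.sum)
      val norm = math.sqrt(w.map(x => x * x).sum)
      v = w.map(_ / norm)
    }
    v
  }

  def main(args: Array[String]): Unit = {
    val dim = 5
    val rows = makeRows(numRows = 200, dim = dim, eps = 0.01, rng = new Random(42))
    val pc = topEigenvector(covariance(rows), iterations = 50)
    val expected = 1.0 / math.sqrt(dim)
    // The sign of an eigenvector is arbitrary, so compare absolute values.
    assert(pc.forall(x => math.abs(math.abs(x) - expected) < 1e-3))
    println(pc.mkString(", "))
  }
}
```

In a real suite the same synthetic rows would presumably be parallelized into an RDD of vectors and passed to `new PCA(k).fit(...)`, with the first column of `pca.pc` compared against 1/sqrt(d) in the same sign-insensitive way.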
[GitHub] spark pull request #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22784#discussion_r228724555

    --- Diff: mllib/src/test/scala/org/apache/spark/mllib/feature/PCASuite.scala ---
    @@ -54,4 +55,21 @@ class PCASuite extends SparkFunSuite with MLlibTestSparkContext {
         // check overflowing
         assert(PCAUtil.memoryCost(4, 6) > Int.MaxValue)
       }
    +
    +  test("number of features more than 65535") {
    +    val rows = 10
    +    val columns = 10
    +    val k = 5
    +    val randomRDD = RandomRDDs.normalVectorRDD(sc, rows, columns, 0, 0)
    +    val pca = new PCA(k).fit(randomRDD)
    +    assert(pca.explainedVariance.size === 5)
    +    assert(pca.pc.numRows === 10 && pca.pc.numCols === 5)
    +    // Eigen values should not be negative
    +    assert(!pca.explainedVariance.values.exists(_ < 0))
    --- End diff --

    You can write `.forall(_ >= 0)` too, but it doesn't matter.
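For reference, the two spellings of the non-negativity check agree on ordinary values but differ when a `NaN` is present, because every ordered comparison with `NaN` is false. The sketch below uses plain arrays rather than the `explainedVariance` vector, and the object and helper names are illustrative assumptions.

```scala
// Hypothetical helper names; the arrays stand in for explained-variance values.
object NonNegativeCheckSketch {
  // Form used in the diff: true when no element is strictly negative.
  def noNegatives(xs: Array[Double]): Boolean = !xs.exists(_ < 0)

  // Form suggested in the review: true when every element is >= 0.
  def allNonNegative(xs: Array[Double]): Boolean = xs.forall(_ >= 0)

  def main(args: Array[String]): Unit = {
    val ok = Array(0.7, 0.2, 0.1)
    val bad = Array(0.7, -0.1, 0.4)
    val withNaN = Array(0.7, Double.NaN)

    assert(noNegatives(ok) && allNonNegative(ok))
    assert(!noNegatives(bad) && !allNonNegative(bad))
    // NaN < 0 and NaN >= 0 are both false, so the two forms disagree here.
    assert(noNegatives(withNaN) && !allNonNegative(withNaN))
  }
}
```

If a test should also fail on `NaN` eigenvalues, the `forall` form is the stricter of the two.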
[GitHub] spark issue #21157: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...
Github user superbobry commented on the issue: https://github.com/apache/spark/pull/21157 @HyukjinKwon do you mean changing the default serializer to cloudpickle and removing `_hack_namedtuple`?
[GitHub] spark issue #22861: [SPARK-25663][SPARK-25661][SQL][TEST] Refactor BuiltInDa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22861 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98133/ Test PASSed.
[GitHub] spark issue #22861: [SPARK-25663][SPARK-25661][SQL][TEST] Refactor BuiltInDa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22861 **[Test build #98133 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98133/testReport)** for PR 22861 at commit [`81fe383`](https://github.com/apache/spark/commit/81fe383d4f1189c3a4a7bae32f8ca38d123e6d7d). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait DataSourceWriteBenchmark extends SqlBasedBenchmark `
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22863 **[Test build #98138 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98138/testReport)** for PR 22863 at commit [`ddcab50`](https://github.com/apache/spark/commit/ddcab50d458dbfad843f74d55aedc51da5c3b6d0).
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4559/ Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Merged build finished. Test PASSed.
[GitHub] spark issue #21950: [SPARK-24914][SQL] Add configuration to avoid OOM during...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21950 **[Test build #98137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98137/testReport)** for PR 21950 at commit [`ddfe945`](https://github.com/apache/spark/commit/ddfe945ef161e59fc2bbc1a12bf40563d2bdd400).
[GitHub] spark issue #22864: [Minor][WEBUI] Remove refresh interval parameter from th...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22864 Can one of the admins verify this patch?
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98136/ Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Merged build finished. Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22863 **[Test build #98136 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98136/testReport)** for PR 22863 at commit [`3109c21`](https://github.com/apache/spark/commit/3109c213c2f875ea7099929621a3be18b5f02862). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public class JavaPrefixSpanExample `
[GitHub] spark pull request #22864: [Minor][WEBUI] Remove refresh interval parameter ...
GitHub user shahidki31 opened a pull request:

    https://github.com/apache/spark/pull/22864

    [Minor][WEBUI] Remove refresh interval parameter from the headerSparkPage method.

    ## What changes were proposed in this pull request?

    'refreshInterval' is not used anywhere in the headerSparkPage method, so we don't need to pass the parameter when calling 'headerSparkPage'.

    ## How was this patch tested?

    Existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shahidki31/spark unusedCode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22864.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22864

commit cd6f3ba922c96fc2f00871d36362bdecb84344a4
Author: Shahid
Date:   2018-10-27T18:49:46Z

    Remove refresh interval from headerSparkPage
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Merged build finished. Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98135/ Test PASSed.
[GitHub] spark issue #22784: [SPARK-25790][MLLIB] PCA: Support more than 65535 column...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22784 **[Test build #98135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98135/testReport)** for PR 22784 at commit [`a8c4391`](https://github.com/apache/spark/commit/a8c43919a5d8624a5a5ddf7ea862a93f2db098c6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Merged build finished. Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22863 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4558/ Test PASSed.
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/22863 @felixcheung
[GitHub] spark issue #22863: [SPARK-25859][ML]add scala/java/python example and doc f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22863 **[Test build #98136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98136/testReport)** for PR 22863 at commit [`3109c21`](https://github.com/apache/spark/commit/3109c213c2f875ea7099929621a3be18b5f02862).
[GitHub] spark pull request #22863: [SPARK-25859][ML]add scala/java/python example an...
GitHub user huaxingao opened a pull request:

    https://github.com/apache/spark/pull/22863

    [SPARK-25859][ML]add scala/java/python example and doc for PrefixSpan

    ## What changes were proposed in this pull request?

    Add Scala/Java/Python examples and documentation for PrefixSpan in branch 2.4.

    ## How was this patch tested?

    Manually tested

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/huaxingao/spark mydocbranch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22863.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22863

commit 3109c213c2f875ea7099929621a3be18b5f02862
Author: Huaxin Gao
Date:   2018-10-27T18:14:36Z

    [SPARK-25859][ML]add scala/java/python example and doc for PrefixSpan
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22847 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98132/ Test PASSed.
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22847 Merged build finished. Test PASSed.
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22847 **[Test build #98132 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98132/testReport)** for PR 22847 at commit [`0db224f`](https://github.com/apache/spark/commit/0db224f0eebc52a8fc1dc47fa03ff78151b3b6d9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98131/ Test PASSed.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Merged build finished. Test PASSed.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21732 **[Test build #98131 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98131/testReport)** for PR 21732 at commit [`79d10c1`](https://github.com/apache/spark/commit/79d10c1ebc7b29a7d05bc1fb71dd543eab23db24). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22847 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98130/ Test PASSed.
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22847 Merged build finished. Test PASSed.
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22847 **[Test build #98130 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98130/testReport)** for PR 22847 at commit [`b578dd4`](https://github.com/apache/spark/commit/b578dd45cb4e6831a4bb54ba4c0d9c8f5c84fec5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.