[GitHub] spark issue #22580: [SPARK-25508][SQL][TEST] Refactor OrcReadBenchmark to us...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22580 **[Test build #96788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96788/testReport)** for PR 22580 at commit [`391d06a`](https://github.com/apache/spark/commit/391d06a6be504a9f0e8068ea95237165842271e6). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221417293 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- Then each function can be in a different trait... I don't think that `runBenchmarkWithCodegen` has much in common with `runBenchmarkWithParquetPushDown`.
[GitHub] spark issue #22580: [SPARK-25508][SQL][TEST] Refactor OrcReadBenchmark to us...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22580 retest this please
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221416957 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- Maybe we can add more common functions in the future, e.g. `runBenchmarkWithCodegen`, `runBenchmarkWithParquetPushDown`, `runBenchmarkWithOrcPushDown`...
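For readers following this thread, here is a minimal sketch of the "each function in a different trait" idea. It is only an illustration under stated assumptions: `ConfHelper`, `CodegenBenchmark`, `PushDownBenchmark`, `withConf`, and the `sketch.*` keys are hypothetical names invented for this example, and the in-memory map merely stands in for a real session configuration. None of this is the code proposed in the pull request.

```scala
import scala.collection.mutable

object BenchmarkTraitsSketch {
  // Hypothetical stand-in for a session configuration; not Spark's SQLConf.
  trait ConfHelper {
    private val conf = mutable.Map.empty[String, String]

    def getConf(key: String): Option[String] = conf.get(key)

    // Set the given keys while `body` runs, then restore the previous values.
    def withConf[T](pairs: (String, String)*)(body: => T): T = {
      val saved = pairs.map { case (k, _) => k -> conf.get(k) }
      pairs.foreach { case (k, v) => conf.update(k, v) }
      try body
      finally {
        saved.foreach {
          case (k, Some(old)) => conf.update(k, old)
          case (k, None) => conf.remove(k)
        }
      }
    }
  }

  // One helper per concern, each in its own trait.
  trait CodegenBenchmark extends ConfHelper {
    // Runs `body` once with the (hypothetical) codegen flag off, once with it on.
    def runWithCodegen[T](body: Boolean => T): Seq[T] =
      Seq(false, true).map { on =>
        withConf("sketch.codegen.enabled" -> on.toString)(body(on))
      }
  }

  trait PushDownBenchmark extends ConfHelper {
    // Same pattern for a (hypothetical) filter push-down flag.
    def runWithPushDown[T](body: Boolean => T): Seq[T] =
      Seq(false, true).map { on =>
        withConf("sketch.pushdown.enabled" -> on.toString)(body(on))
      }
  }

  // A concrete benchmark mixes in only the helpers it needs.
  object Demo extends CodegenBenchmark with PushDownBenchmark

  def main(args: Array[String]): Unit = {
    val seen = Demo.runWithCodegen { on =>
      Demo.getConf("sketch.codegen.enabled").contains(on.toString)
    }
    println(seen)
    println(Demo.getConf("sketch.codegen.enabled"))
  }
}
```

The trade-off discussed in the thread shows up here: a single base trait would collect all helpers in one place, while separate traits keep unrelated helpers apart at the cost of more names to choose.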
[GitHub] spark issue #22589: [SPARK-25572][SPARKR] test only if not cran
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22589 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3588/ Test PASSed.
[GitHub] spark issue #22589: [SPARK-25572][SPARKR] test only if not cran
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22589 Merged build finished. Test PASSed.
[GitHub] spark issue #22589: [SPARK-25572][SPARKR] test only if not cran
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22589 **[Test build #96787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96787/testReport)** for PR 22589 at commit [`3b7414d`](https://github.com/apache/spark/commit/3b7414d9adf55ef74ce2d81403e570e5d6951a05).
[GitHub] spark issue #22589: [SPARK-25572][SPARKR] test only if not cran
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22589 diff w/o whitespace for actual change
[GitHub] spark pull request #22589: [SPARK-25572][SPARKR] test only if not cran
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/22589 [SPARK-25572][SPARKR] test only if not cran ## What changes were proposed in this pull request? CRAN doesn't seem to respect the system requirements when running tests - we have seen cases where SparkR is run on Java 10, which unfortunately Spark does not start on. For 2.4, let's attempt skipping all tests ## How was this patch tested? manual, jenkins, appveyor You can merge this pull request into a Git repository by running: $ git pull https://github.com/felixcheung/spark ralltests Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22589.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22589 commit 3b7414d9adf55ef74ce2d81403e570e5d6951a05 Author: Felix Cheung Date: 2018-09-29T05:10:23Z test only if not cran
[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22379#discussion_r221415711 --- Diff: R/pkg/R/functions.R --- @@ -2203,6 +2209,23 @@ setMethod("from_json", signature(x = "Column", schema = "characterOrstructType") column(jc) }) +#' @details +#' \code{from_csv}: Parses a column containing a CSV string into a Column of \code{structType} +#' with the specified \code{schema}. +#' If the string is unparseable, the Column will contain the value NA. +#' +#' @rdname column_collection_functions +#' @aliases from_csv from_csv,Column,character-method +#' @note from_csv since 2.5.0 --- End diff -- consider adding an example, as in https://github.com/apache/spark/pull/22379/files#diff-d97f9adc2dcac0703568c799ff106987R2180?
[GitHub] spark pull request #22455: [SPARK-24572][SPARKR] "eager execution" for R she...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22455#discussion_r221416198 --- Diff: R/pkg/tests/fulltests/test_eager_execution.R --- @@ -21,11 +21,7 @@ context("Show SparkDataFrame when eager execution is enabled.") test_that("eager execution is not enabled", { # Start Spark session without eager execution enabled - sparkSession <- if (windows_with_hadoop()) { -sparkR.session(master = sparkRTestMaster) - } else { -sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE) - } + sparkR.session(master = sparkRTestMaster) --- End diff -- as mentioned here https://github.com/apache/spark/pull/22455#discussion_r220030686
[GitHub] spark pull request #22455: [SPARK-24572][SPARKR] "eager execution" for R she...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22455#discussion_r221416164 --- Diff: R/pkg/tests/fulltests/test_eager_execution.R --- @@ -21,11 +21,7 @@ context("Show SparkDataFrame when eager execution is enabled.") test_that("eager execution is not enabled", { # Start Spark session without eager execution enabled - sparkSession <- if (windows_with_hadoop()) { -sparkR.session(master = sparkRTestMaster) - } else { -sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE) - } + sparkR.session(master = sparkRTestMaster) --- End diff -- you should definitely set `enableHiveSupport = FALSE` - historically this hasn't worked well in other R tests when the hive catalog is enabled
[GitHub] spark pull request #22455: [SPARK-24572][SPARKR] "eager execution" for R she...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22455#discussion_r221415826 --- Diff: R/pkg/R/DataFrame.R --- @@ -246,30 +248,38 @@ setMethod("showDF", #' @note show(SparkDataFrame) since 1.4.0 setMethod("show", "SparkDataFrame", function(object) { -allConf <- sparkR.conf() -if (!is.null(allConf[["spark.sql.repl.eagerEval.enabled"]]) && -identical(allConf[["spark.sql.repl.eagerEval.enabled"]], "true")) { - argsList <- list() - argsList$x <- object - if (!is.null(allConf[["spark.sql.repl.eagerEval.maxNumRows"]])) { -numRows <- as.numeric(allConf[["spark.sql.repl.eagerEval.maxNumRows"]]) -if (numRows > 0) { - argsList$numRows <- numRows +showFunc <- getOption("sparkr.SparkDataFrame.base_show_func") --- End diff -- hmm, this naming convention? Typically we don't mix `.` and `_`, and I don't think we have anything with `SparkDataFrame` in it. It seems we have used `spark.sparkr.something` before: https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/core/src/main/scala/org/apache/spark/api/r/RRunner.scala#L343
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221416202 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- How about `CodegenBenchmarkBase`? This is the best I can think of. @wangyum @dongjoon-hyun @cloud-fan
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22586 This was introduced by AccumulatorV2. It might be a blocker issue for Spark 2.4, since this could return an incorrect result.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22586 LGTM cc @cloud-fan
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22522 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96783/ Test PASSed.
[GitHub] spark pull request #22586: [SPARK-25568][Core]Continue to update the remaini...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22586#discussion_r221415820 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -1880,6 +1880,26 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with TimeLi assert(sc.parallelize(1 to 10, 2).count() === 10) } + test("misbehaved accumulator should not impact other accumulators") { --- End diff -- Also verify the log message? See the example: https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizerLoggingSuite.scala#L41-L52
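As an illustration of what "verify the log message" can look like in a test, here is a minimal, self-contained sketch. It deliberately swaps in `java.util.logging` from the JDK rather than the logging setup used by the linked suite, so that it runs without Spark on the classpath. `CapturingHandler`, `applyAll`, and the message text are hypothetical names made up for this example; the real test and its expected message may differ.

```scala
import java.util.logging.{Handler, Level, LogRecord, Logger}
import scala.collection.mutable.ArrayBuffer

object LogCaptureSketch {
  // Collects every record published to it so a test can inspect them later.
  final class CapturingHandler extends Handler {
    val records: ArrayBuffer[LogRecord] = ArrayBuffer.empty
    override def publish(record: LogRecord): Unit = { records += record; () }
    override def flush(): Unit = ()
    override def close(): Unit = ()
  }

  // Hypothetical unit under test: apply every update, log failures, keep going.
  // Returns the number of updates that succeeded.
  def applyAll(updates: Seq[() => Unit], log: Logger): Int =
    updates.count { update =>
      try { update(); true }
      catch {
        case e: Exception =>
          log.log(Level.SEVERE, "Failed to apply update", e)
          false
      }
    }

  def main(args: Array[String]): Unit = {
    val log = Logger.getLogger("LogCaptureSketch")
    val handler = new CapturingHandler
    log.setUseParentHandlers(false) // keep the console quiet during the check
    log.addHandler(handler)
    try {
      val updates: Seq[() => Unit] = Seq(
        () => (),
        () => throw new IllegalStateException("misbehaving"),
        () => ())
      // The failing update must not stop the remaining ones...
      assert(applyAll(updates, log) == 2)
      // ...and the failure must have been logged.
      assert(handler.records.exists(_.getMessage.contains("Failed to apply update")))
    } finally {
      log.removeHandler(handler)
    }
  }
}
```

Checking the captured message in addition to the result guards against a change that silently swallows the failure.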
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22522 Merged build finished. Test PASSed.
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22522 **[Test build #96783 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96783/testReport)** for PR 22522 at commit [`20668ad`](https://github.com/apache/spark/commit/20668adc843815e97b3d1950ed2ab86bdb66540f). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper `
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221415701 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- @dongjoon-hyun in https://github.com/apache/spark/pull/22522 I feel that it would be better to have an example refactoring, so that we can see how the new trait is used. We can move back to https://github.com/apache/spark/pull/22522. I am OK either way.
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221415642 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- Actually, I don't think the name `SqlBasedBenchmark` is appropriate. From the name we can't tell that it is about benchmarking with and without whole-stage codegen. I will try to come up with a better name. Or we can discuss it in this thread.
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22461 **[Test build #4354 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4354/testReport)** for PR 22461 at commit [`f6274a5`](https://github.com/apache/spark/commit/f6274a50177e18be7b36d87913c44103f2fa02d2).
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22588 **[Test build #96786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96786/testReport)** for PR 22588 at commit [`56db1e0`](https://github.com/apache/spark/commit/56db1e0466e3b6625d4cb11d0da66f8c7eeddfaa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22588 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96786/ Test PASSed.
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22588 Merged build finished. Test PASSed.
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22588 **[Test build #96786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96786/testReport)** for PR 22588 at commit [`56db1e0`](https://github.com/apache/spark/commit/56db1e0466e3b6625d4cb11d0da66f8c7eeddfaa).
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22588 Merged build finished. Test PASSed.
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22588 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3587/ Test PASSed.
[GitHub] spark issue #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22588 cc @rvesse @mccheah @dongjoon-hyun
[GitHub] spark pull request #22588: [SPARK-25262][DOC][FOLLOWUP] Fix link tags in htm...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22588 [SPARK-25262][DOC][FOLLOWUP] Fix link tags in html table ## What changes were proposed in this pull request? Markdown links do not work inside an HTML table. We should use the HTML link tag instead. ## How was this patch tested? Verified in IntelliJ IDEA's markdown editor and an online markdown editor. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 SPARK-25262-followup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22588.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22588 commit 56db1e0466e3b6625d4cb11d0da66f8c7eeddfaa Author: Liang-Chi Hsieh Date: 2018-09-29T03:46:33Z Use html syntax for links.
[GitHub] spark pull request #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22587
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22587 Merged to master, branch-2.4 and branch-2.3.
[GitHub] spark pull request #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22587#discussion_r221414597 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -206,7 +206,7 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { object PROCESS_TABLES extends QueryTest with SQLTestUtils { // Tests the latest version of every release line. - val testingVersions = Seq("2.1.3", "2.2.2", "2.3.1") + val testingVersions = Seq("2.1.3", "2.2.2", "2.3.2") --- End diff -- @dongjoon-hyun, can you update the release guide in spark-website as well?
[GitHub] spark issue #22581: [SPARK-25565][BUILD] Add scalastyle rule to check add Lo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22581 Merged build finished. Test PASSed.
[GitHub] spark issue #22581: [SPARK-25565][BUILD] Add scalastyle rule to check add Lo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22581 **[Test build #96785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96785/testReport)** for PR 22581 at commit [`d48825d`](https://github.com/apache/spark/commit/d48825d9469fa8c9d360620db9183f1ec949f67c).
[GitHub] spark issue #22581: [SPARK-25565][BUILD] Add scalastyle rule to check add Lo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22581 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3586/ Test PASSed.
[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22514 **[Test build #96784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96784/testReport)** for PR 22514 at commit [`1223178`](https://github.com/apache/spark/commit/122317891cb2794d67b5e11157a43afaa25edbba).
[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22514 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3585/ Test PASSed.
[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22514 Merged build finished. Test PASSed.
[GitHub] spark pull request #22584: [SPARK-25262][DOC][FOLLOWUP] Fix missing markup t...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22584
[GitHub] spark issue #22584: [SPARK-25262][DOC][FOLLOWUP] Fix missing markup tag
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22584 Merged to master and branch-2.4.
[GitHub] spark issue #21688: [SPARK-21809] : Change Stage Page to use datatables to s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21688 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96780/ Test PASSed.
[GitHub] spark issue #21688: [SPARK-21809] : Change Stage Page to use datatables to s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21688 Merged build finished. Test PASSed.
[GitHub] spark issue #21688: [SPARK-21809] : Change Stage Page to use datatables to s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21688 **[Test build #96780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96780/testReport)** for PR 21688 at commit [`a91b306`](https://github.com/apache/spark/commit/a91b306d2ddbe977e7086d4392f5a9d292264b7c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22582: [SPARK-25505][SQL][FOLLOWUP] Fix for attributes c...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22582#discussion_r221413500 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -556,7 +556,7 @@ class Analyzer( // Group-by expressions coming from SQL are implicit and need to be deduced. val groupByExprs = groupByExprsOpt.getOrElse { val pivotColAndAggRefs = -(pivotColumn.references ++ aggregates.flatMap(_.references)).toSet + aggregates.map(_.references).foldLeft(pivotColumn.references)(_ ++ _) --- End diff -- Good catch!
[GitHub] spark pull request #22582: [SPARK-25505][SQL][FOLLOWUP] Fix for attributes c...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22582#discussion_r221413483 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -556,7 +556,7 @@ class Analyzer( // Group-by expressions coming from SQL are implicit and need to be deduced. val groupByExprs = groupByExprsOpt.getOrElse { val pivotColAndAggRefs = -(pivotColumn.references ++ aggregates.flatMap(_.references)).toSet + aggregates.map(_.references).foldLeft(pivotColumn.references)(_ ++ _) --- End diff -- `pivotColumn.references ++ AttributeSet(aggregates)`
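The suggestion above relies on `AttributeSet` comparing attributes by expression ID rather than by structural equality, so two references to the same column that differ only in capitalization collapse into one. A minimal sketch of that idea in plain Scala, without Spark on the classpath, might look as follows. `Attr`, `IdKeyedSet`, `PivotRefsSketch`, and the sample IDs are hypothetical stand-ins for illustration, not Catalyst classes.

```scala
// Hypothetical stand-in for a Catalyst attribute; `id` plays the role of the expression ID.
final case class Attr(name: String, id: Long)

// Hypothetical set that keys attributes by id only, ignoring how the name is spelled.
final class IdKeyedSet private (private val byId: Map[Long, Attr]) {
  def ++(other: IdKeyedSet): IdKeyedSet = new IdKeyedSet(byId ++ other.byId)
  def size: Int = byId.size
  def ids: Set[Long] = byId.keySet
}

object IdKeyedSet {
  def apply(attrs: Seq[Attr]): IdKeyedSet =
    new IdKeyedSet(attrs.map(a => a.id -> a).toMap)
}

object PivotRefsSketch {
  val pivotRefs: Seq[Attr] = Seq(Attr("course", 1L))

  // The same column referenced twice with different capitalization.
  val aggRefs: Seq[Attr] = Seq(Attr("earnings", 2L), Attr("Earnings", 2L))

  // Structural equality treats the two spellings as distinct elements.
  val structural: Set[Attr] = (pivotRefs ++ aggRefs).toSet

  // An id-keyed union treats them as the same attribute.
  val idKeyed: IdKeyedSet = IdKeyedSet(pivotRefs) ++ IdKeyedSet(aggRefs)

  def main(args: Array[String]): Unit =
    println(s"structural size = ${structural.size}, id-keyed size = ${idKeyed.size}")
}
```

In this sketch the structural set keeps both spellings of the earnings column, while the id-keyed union keeps one entry per ID, which is the behaviour the review comment appears to be after.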
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22587 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96781/ Test PASSed.
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22587 Merged build finished. Test PASSed.
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22587 **[Test build #96781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96781/testReport)** for PR 22587 at commit [`c4ad536`](https://github.com/apache/spark/commit/c4ad536a30a5fe446938e01b961d445ef3b276fa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96779/ Test PASSed.
[GitHub] spark issue #22580: [SPARK-25508][SQL][TEST] Refactor OrcReadBenchmark to us...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22580 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96782/ Test FAILed.
[GitHub] spark issue #22580: [SPARK-25508][SQL][TEST] Refactor OrcReadBenchmark to us...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22580 Merged build finished. Test FAILed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Merged build finished. Test PASSed.
[GitHub] spark issue #22580: [SPARK-25508][SQL][TEST] Refactor OrcReadBenchmark to us...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22580 **[Test build #96782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96782/testReport)** for PR 22580 at commit [`391d06a`](https://github.com/apache/spark/commit/391d06a6be504a9f0e8068ea95237165842271e6). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22586 **[Test build #96779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96779/testReport)** for PR 22586 at commit [`f582c34`](https://github.com/apache/spark/commit/f582c34555375e177d9a5e14437176a2d776882b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Merged build finished. Test PASSed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96778/ Test PASSed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22586 **[Test build #96778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96778/testReport)** for PR 22586 at commit [`a3d4a0d`](https://github.com/apache/spark/commit/a3d4a0d5f32f3ef39dfaddbe8f2811207aba403f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r221411618 --- Diff: docs/sql-programming-guide.md --- @@ -1489,7 +1489,7 @@ See the [Apache Avro Data Source Guide](avro-data-source-guide.html). * The JDBC driver class must be visible to the primordial class loader on the client session and on all executors. This is because Java's DriverManager class does a security check that results in it ignoring all drivers not visible to the primordial class loader when one goes to open a connection. One convenient way to do this is to modify compute_classpath.sh on all worker nodes to include your driver JARs. * Some databases, such as H2, convert all names to upper case. You'll need to use upper case to refer to those names in Spark SQL. - + * Users can specify vendor-specific JDBC connection properties in the data source options to do special treatment. For example, `spark.read.format("jdbc").option("url", oracleJdbcUrl).option("oracle.jdbc.mapDateToTimestamp", "false")`. `oracle.jdbc.mapDateToTimestamp` defaults to true, users often need to disable this flag to avoid Oracle date being resolved as timestamp. --- End diff -- @maropu @gatorsmile Could you please suggest something else to do here?
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96776/ Test PASSed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Merged build finished. Test PASSed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96777/ Test PASSed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22586 Merged build finished. Test PASSed.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22586 **[Test build #96776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96776/testReport)** for PR 22586 at commit [`badb471`](https://github.com/apache/spark/commit/badb4711caff6e3d10ba9037f71c1ad6515577e8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22586: [SPARK-25568][Core]Continue to update the remaining accu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22586 **[Test build #96777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96777/testReport)** for PR 22586 at commit [`e33788e`](https://github.com/apache/spark/commit/e33788e2df25be67cdd2831082d2999104a7c740). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22577: [CORE][MINOR] Fix obvious error and compiling for...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22577#discussion_r221410603 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/OneApplicationResource.scala --- @@ -175,7 +175,7 @@ private[v1] class OneApplicationAttemptResource extends AbstractApplicationResou def getAttempt(): ApplicationAttemptInfo = { uiRoot.getApplicationInfo(appId) .flatMap { app => -app.attempts.filter(_.attemptId == attemptId).headOption --- End diff -- @sadhen Since this is a general code fix for `master/branch-2.4/branch-2.3`, can we update the title without mentioning `compiling for Scala 2.12.7`? > branch-2.2 is fine, but branch-2.3 should be fixed.
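The diff above removes a lookup written as `filter(...).headOption`; the replacement line is not shown in this message. As a rough sketch of the kind of lookup involved, assuming the stored attempt ID is an `Option[String]` while the requested ID is a plain `String`, `Option.contains` expresses the match directly (a bare `==` between an `Option[String]` and a `String` can never be true), and `find` is equivalent to `filter(...).headOption`. `Attempt` and `AttemptLookupSketch` are hypothetical names, not Spark classes.

```scala
// Hypothetical model: the stored attempt ID is optional, the requested ID is a plain String.
final case class Attempt(attemptId: Option[String])

object AttemptLookupSketch {
  val attempts: Seq[Attempt] =
    Seq(Attempt(Some("1")), Attempt(Some("2")), Attempt(None))

  // Option.contains compares the wrapped value, so Some("2") matches "2".
  def viaFilter(id: String): Option[Attempt] =
    attempts.filter(_.attemptId.contains(id)).headOption

  // find stops at the first match and returns the same result as filter + headOption.
  def viaFind(id: String): Option[Attempt] =
    attempts.find(_.attemptId.contains(id))

  def main(args: Array[String]): Unit = {
    println(viaFilter("2"))
    println(viaFind("2"))
    println(viaFind("9"))
  }
}
```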
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22587 Thank you for the review, @wangyum.
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22587 LGTM
[GitHub] spark pull request #22582: [SPARK-25505][SQL][FOLLOWUP] Fix for attributes c...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22582#discussion_r221409258 --- Diff: sql/core/src/test/resources/sql-tests/inputs/pivot.sql --- @@ -297,3 +297,13 @@ PIVOT ( sum(earnings) FOR course IN ('dotNET', 'Java') ); + +-- correctly handle pivot columns with different cases +SELECT * FROM ( + SELECT course, earnings, "a" as a, "z" as z, "b" as b, "y" as y, "c" as c, "x" as x, "d" as d, "w" as w + FROM courseSales +) +PIVOT ( + sum(Earnings) --- End diff -- You can update the comment at line 291, too.
[GitHub] spark pull request #22582: [SPARK-25505][SQL][FOLLOWUP] Fix for attributes c...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22582#discussion_r221409166 --- Diff: sql/core/src/test/resources/sql-tests/inputs/pivot.sql --- @@ -297,3 +297,13 @@ PIVOT ( sum(earnings) FOR course IN ('dotNET', 'Java') ); + +-- correctly handle pivot columns with different cases +SELECT * FROM ( + SELECT course, earnings, "a" as a, "z" as z, "b" as b, "y" as y, "c" as c, "x" as x, "d" as d, "w" as w + FROM courseSales +) +PIVOT ( + sum(Earnings) --- End diff -- Nice catch, @mgaido91. Instead of adding this new test case, please change `sum(earnings)` to `sum(Earnings)` in line 297.
[GitHub] spark issue #22580: [SPARK-25508][SQL][TEST] Refactor OrcReadBenchmark to us...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22580 **[Test build #96782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96782/testReport)** for PR 22580 at commit [`391d06a`](https://github.com/apache/spark/commit/391d06a6be504a9f0e8068ea95237165842271e6).
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22522 **[Test build #96783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96783/testReport)** for PR 22522 at commit [`20668ad`](https://github.com/apache/spark/commit/20668adc843815e97b3d1950ed2ab86bdb66540f).
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22522 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3584/ Test PASSed.
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22522 Merged build finished. Test PASSed.
[GitHub] spark pull request #22574: [SPARK-25559][SQL] Remove the unsupported predica...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22574
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22522 cc @dongjoon-hyun
[GitHub] spark pull request #22522: [SPARK-25510][TEST] Create new trait replace Benc...
GitHub user wangyum reopened a pull request: https://github.com/apache/spark/pull/22522 [SPARK-25510][TEST] Create new trait replace BenchmarkWithCodegen

## What changes were proposed in this pull request?

We need to create a new trait to replace `BenchmarkWithCodegen`, because `BenchmarkWithCodegen` extends `SparkFunSuite`. For example, when refactoring `AggregateBenchmark`:

Before this change:
```scala
object AggregateBenchmark extends BenchmarkBase {

  lazy val sparkSession = SparkSession.builder
    .master("local[1]")
    .appName(this.getClass.getSimpleName)
    .config("spark.sql.shuffle.partitions", 1)
    .config("spark.sql.autoBroadcastJoinThreshold", 1)
    .getOrCreate()

  /** Runs function `f` with whole stage codegen on and off. */
  def runBenchmark(name: String, cardinality: Long)(f: => Unit): Unit = {
    val benchmark = new Benchmark(name, cardinality, output = output)

    benchmark.addCase(s"$name wholestage off", numIters = 2) { iter =>
      sparkSession.conf.set("spark.sql.codegen.wholeStage", value = false)
      f
    }

    benchmark.addCase(s"$name wholestage on", numIters = 5) { iter =>
      sparkSession.conf.set("spark.sql.codegen.wholeStage", value = true)
      f
    }

    benchmark.run()
  }

  override def benchmark(): Unit = {
    runBenchmark("aggregate without grouping") {
      val N = 500L << 22
      runBenchmark("agg w/o group", N) {
        sparkSession.range(N).selectExpr("sum(id)").collect()
      }
    }
    ...
```

After this change:
```scala
object AggregateBenchmark extends SqlBasedBenchmark {

  override def benchmark(): Unit = {
    runBenchmark("aggregate without grouping") {
      val N = 500L << 22
      runBenchmark("agg w/o group", N) {
        sparkSession.range(N).selectExpr("sum(id)").collect()
      }
    }
    ...
```

All these benchmarks will use this trait:
```
AggregateBenchmark
BenchmarkWideTable
JoinBenchmark
MiscBenchmark
ObjectHashAggregateExecBenchmark
SortBenchmark
UnsafeArrayDataBenchmark
```

## How was this patch tested?

manual tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wangyum/spark SPARK-25510

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22522.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22522

commit 275cc6c5f8f106eb339c7ed01734e279a223705e
Author: Yuming Wang
Date: 2018-09-21T17:36:57Z

    Create new BenchmarkWithCodegen trait doesn't extends SparkFunSuite

commit c62a3be61cdac0e3dffbd5d0ce4a431c2ef2f931
Author: Yuming Wang
Date: 2018-09-29T00:26:47Z

    Merge remote-tracking branch 'upstream/master' into SPARK-25510

commit 20668adc843815e97b3d1950ed2ab86bdb66540f
Author: Yuming Wang
Date: 2018-09-29T00:47:28Z

    Rename RunBenchmarkWithCodegen to SqlBasedBenchmark
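The idea behind a shared trait, running the same body once per setting of a toggle so that each benchmark object only supplies the body, can be sketched without Spark. This is only an illustration under stated assumptions: `ToggleBenchmark`, `settings`, `runWithToggle`, and `SumBenchmark` are hypothetical names, and a plain mutable map stands in for the session configuration. It is not the trait proposed in the pull request.

```scala
import scala.collection.mutable

// Hypothetical base trait: a mutable map stands in for the Spark session configuration.
trait ToggleBenchmark {
  val settings: mutable.Map[String, String] = mutable.Map.empty

  // Runs `f` once with the flag off and once with it on; returns (label, elapsed nanos).
  def runWithToggle(name: String, key: String)(f: => Unit): Seq[(String, Long)] =
    Seq(false, true).map { enabled =>
      settings(key) = enabled.toString
      val state = if (enabled) "on" else "off"
      val start = System.nanoTime()
      f
      val elapsed = System.nanoTime() - start
      (name + " " + key + " " + state, elapsed)
    }
}

// A concrete benchmark only supplies the body; the toggling lives in the trait.
object SumBenchmark extends ToggleBenchmark {
  def main(args: Array[String]): Unit = {
    val results = runWithToggle("sum", "codegen") {
      val total = (0L until 1000000L).sum
      require(total > 0L)
    }
    results.foreach { case (label, nanos) => println(s"$label: $nanos ns") }
  }
}
```

Keeping the toggle loop in the trait mirrors the motivation stated in the description: the per-benchmark objects shrink to their bodies, and the on/off cases are defined in one place.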
[GitHub] spark issue #22574: [SPARK-25559][SQL] Remove the unsupported predicates in ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22574 Merged to master.
[GitHub] spark issue #22574: [SPARK-25559][SQL] Remove the unsupported predicates in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22574 Merged build finished. Test PASSed.
[GitHub] spark issue #22574: [SPARK-25559][SQL] Remove the unsupported predicates in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22574 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96775/ Test PASSed.
[GitHub] spark issue #22574: [SPARK-25559][SQL] Remove the unsupported predicates in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22574 **[Test build #96775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96775/testReport)** for PR 22574 at commit [`9a9e47f`](https://github.com/apache/spark/commit/9a9e47fb242afb94e7df917c852425cc0f5114e0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22587 **[Test build #96781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96781/testReport)** for PR 22587 at commit [`c4ad536`](https://github.com/apache/spark/commit/c4ad536a30a5fe446938e01b961d445ef3b276fa).
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22587 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3583/ Test PASSed.
[GitHub] spark issue #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in Hiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22587 Merged build finished. Test PASSed.
[GitHub] spark pull request #22587: [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/22587 [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in HiveExternalCatalogVersionsSuite

## What changes were proposed in this pull request?

This PR aims to prevent test slowdowns at `HiveExternalCatalogVersionsSuite` by using the latest Apache Spark 2.3.2 link, because the Apache mirrors will eventually remove the old Spark 2.3.1 binaries. `HiveExternalCatalogVersionsSuite` will not fail, because [SPARK-24813](https://issues.apache.org/jira/browse/SPARK-24813) implements fallback logic. However, it will cause many trials in all builds over `branch-2.3/branch-2.4/master`, so we had better fix this issue.

## How was this patch tested?

Pass the Jenkins with the updated version.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-25570

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22587.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22587

commit c4ad536a30a5fe446938e01b961d445ef3b276fa
Author: Dongjoon Hyun
Date: 2018-09-29T00:11:36Z

    [SPARK-25570][SQL][TEST] Replace 2.3.1 with 2.3.2 in HiveExternalCatalogVersionsSuite
[GitHub] spark issue #22574: [SPARK-25559][SQL] Remove the unsupported predicates in ...
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/22574 I changed the title; hopefully it's much clearer now.
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22473
[GitHub] spark issue #22473: [SPARK-25449][CORE] Heartbeat shouldn't include accumula...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22473 Thanks! Merging to master.
[GitHub] spark issue #22566: [SPARK-25458][SQL] Support FOR ALL COLUMNS in ANALYZE TA...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/22566 Thank you very much @gatorsmile @dongjoon-hyun @juliuszsompolski
[GitHub] spark issue #22473: [SPARK-25449][CORE] Heartbeat shouldn't include accumula...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22473 Merged build finished. Test PASSed.
[GitHub] spark issue #22473: [SPARK-25449][CORE] Heartbeat shouldn't include accumula...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22473 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96771/ Test PASSed.
[GitHub] spark issue #22473: [SPARK-25449][CORE] Heartbeat shouldn't include accumula...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22473 **[Test build #96771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96771/testReport)** for PR 22473 at commit [`f6fa337`](https://github.com/apache/spark/commit/f6fa33790769c14d9dde6f56e07233c2887d80a6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22237 Merged build finished. Test PASSed.
[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22237 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96772/ Test PASSed.
[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22237 **[Test build #96772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96772/testReport)** for PR 22237 at commit [`1f052c4`](https://github.com/apache/spark/commit/1f052c47276193280bfc08028b65e32c81f0c022). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221401306 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- So, if @gengliangwang agrees with that, `SqlBasedBenchmark` is another refactoring (renaming and improvement) like `[SPARK-25499][TEST] Refactor BenchmarkBase and Benchmark`. Could you do that in a separate PR in advance?
[GitHub] spark pull request #22580: [SPARK-25508][SQL] Refactor OrcReadBenchmark to u...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22580#discussion_r221398055 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala --- @@ -436,49 +331,36 @@ object OrcReadBenchmark extends SQLHelper { spark.sql(s"SELECT sum(c$middle) FROM hiveOrcTable").collect() } -/* -Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27 on Mac OS X 10.13.1 -Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz - -Single Column Scan from 100 columns: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative - -Native ORC MR 1050 / 1053 1.0 1001.1 1.0X -Native ORC Vectorized 95 / 101 11.0 90.9 11.0X -Native ORC Vectorized with copy 95 / 102 11.0 90.9 11.0X -Hive built-in ORC 348 / 358 3.0 331.8 3.0X - -Single Column Scan from 200 columns: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative - -Native ORC MR 2099 / 2108 0.5 2002.1 1.0X -Native ORC Vectorized 179 / 187 5.8 171.1 11.7X -Native ORC Vectorized with copy 176 / 188 6.0 167.6 11.9X -Hive built-in ORC 562 / 581 1.9 535.9 3.7X - -Single Column Scan from 300 columns: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative - -Native ORC MR 3221 / 3246 0.3 3071.4 1.0X -Native ORC Vectorized 312 / 322 3.4 298.0 10.3X -Native ORC Vectorized with copy 306 / 320 3.4 291.6 10.5X -Hive built-in ORC 815 / 824 1.3 777.3 4.0X -*/ benchmark.run() } } } - def main(args: Array[String]): Unit = { -Seq(ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType).foreach { dataType => - numericScanBenchmark(1024 * 1024 * 15, dataType) + override def benchmark(): Unit = { +runBenchmark("SQL Single Column Scan") { --- End diff -- Also, please add `[TEST]` into the title.
[GitHub] spark pull request #22580: [SPARK-25508][SQL] Refactor OrcReadBenchmark to u...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22580#discussion_r221397970 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala --- @@ -436,49 +331,36 @@ object OrcReadBenchmark extends SQLHelper { spark.sql(s"SELECT sum(c$middle) FROM hiveOrcTable").collect() } -/* -Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27 on Mac OS X 10.13.1 -Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz - -Single Column Scan from 100 columns: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative - -Native ORC MR 1050 / 1053 1.0 1001.1 1.0X -Native ORC Vectorized 95 / 101 11.0 90.9 11.0X -Native ORC Vectorized with copy 95 / 102 11.0 90.9 11.0X -Hive built-in ORC 348 / 358 3.0 331.8 3.0X - -Single Column Scan from 200 columns: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative - -Native ORC MR 2099 / 2108 0.5 2002.1 1.0X -Native ORC Vectorized 179 / 187 5.8 171.1 11.7X -Native ORC Vectorized with copy 176 / 188 6.0 167.6 11.9X -Hive built-in ORC 562 / 581 1.9 535.9 3.7X - -Single Column Scan from 300 columns: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative - -Native ORC MR 3221 / 3246 0.3 3071.4 1.0X -Native ORC Vectorized 312 / 322 3.4 298.0 10.3X -Native ORC Vectorized with copy 306 / 320 3.4 291.6 10.5X -Hive built-in ORC 815 / 824 1.3 777.3 4.0X -*/ benchmark.run() } } } - def main(args: Array[String]): Unit = { -Seq(ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType).foreach { dataType => - numericScanBenchmark(1024 * 1024 * 15, dataType) + override def benchmark(): Unit = { +runBenchmark("SQL Single Column Scan") { --- End diff -- nit: `SQL Single Column Scan` -> `SQL Single Numeric Column Scan`?
[GitHub] spark pull request #22484: [SPARK-25476][SPARK-25510][TEST] Refactor Aggrega...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22484#discussion_r221397904 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} +import org.apache.spark.sql.SparkSession +import org.apache.spark.sql.catalyst.plans.SQLHelper +import org.apache.spark.sql.internal.SQLConf + +/** + * Common base trait to run benchmark with the Dataset and DataFrame API. + */ +trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper { --- End diff -- I think we can remove `BenchmarkWithCodegen` after all the refactoring is finished.