[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78561/testReport)** for PR 17222 at commit [`da71c93`](https://github.com/apache/spark/commit/da71c938a401a2e11ba61a9afe05ba8c689b98b1). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410 **[Test build #78560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78560/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742).
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412 **[Test build #78559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78559/testReport)** for PR 18412 at commit [`6ad657c`](https://github.com/apache/spark/commit/6ad657c211a1f06fad6f6a33cdcb77cc67141e27).
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18412 LGTM, pending test
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/18410 retest this please
[GitHub] spark pull request #18106: [SPARK-20754][SQL] Support TRUNC (number)
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/18106#discussion_r123871837

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MiscExpressionsSuite.scala ---
@@ -44,4 +46,49 @@ class MiscExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     assert(evaluate(Uuid()) !== evaluate(Uuid()))
   }
+  test("trunc") {
+    // numeric
+    def testTruncNumber(input: Double, fmt: Int, expected: Double): Unit = {
+      checkEvaluation(Trunc(Literal.create(input, DoubleType),
+        Literal.create(fmt, IntegerType)),
+        expected)
+      checkEvaluation(Trunc(Literal.create(input, DoubleType),
+        NonFoldableLiteral.create(fmt, IntegerType)),
+        expected)
+    }
+
+    testTruncNumber(1234567891.1234567891, 4, 1234567891.1234)
+    testTruncNumber(1234567891.1234567891, -4, 123456)
+    testTruncNumber(1234567891.1234567891, 0, 1234567891)
--- End diff --

Also check testTruncNumber(0.1234567891, -1, 0.1234567891)?
[GitHub] spark pull request #18106: [SPARK-20754][SQL] Support TRUNC (number)
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/18106#discussion_r123871820

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MiscExpressionsSuite.scala ---
@@ -44,4 +46,49 @@ class MiscExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     assert(evaluate(Uuid()) !== evaluate(Uuid()))
   }
+  test("trunc") {
+    // numeric
+    def testTruncNumber(input: Double, fmt: Int, expected: Double): Unit = {
+      checkEvaluation(Trunc(Literal.create(input, DoubleType),
+        Literal.create(fmt, IntegerType)),
+        expected)
+      checkEvaluation(Trunc(Literal.create(input, DoubleType),
+        NonFoldableLiteral.create(fmt, IntegerType)),
+        expected)
+    }
+
+    testTruncNumber(1234567891.1234567891, 4, 1234567891.1234)
+    testTruncNumber(1234567891.1234567891, -4, 123456)
+    testTruncNumber(1234567891.1234567891, 0, 1234567891)
+
+    checkEvaluation(Trunc(Literal.create(1D, DoubleType),
+      NonFoldableLiteral.create(null, IntegerType)),
+      null)
+    checkEvaluation(Trunc(Literal.create(null, DoubleType),
+      NonFoldableLiteral.create(1, IntegerType)),
+      null)
+    checkEvaluation(Trunc(Literal.create(null, DoubleType),
+      NonFoldableLiteral.create(null, IntegerType)),
+      null)
+
+    // date
--- End diff --

Shall we split this test into two tests for numeric and date respectively?
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871674

--- Diff: sql/hive/src/test/java/org/apache/spark/sql/hive/JavaDataFrameSuite.java ---
@@ -31,7 +31,7 @@
 import org.apache.spark.sql.expressions.UserDefinedAggregateFunction;
 import static org.apache.spark.sql.functions.*;
 import org.apache.spark.sql.hive.test.TestHive$;
-import org.apache.spark.sql.hive.aggregate.MyDoubleSum;
+import test.org.apache.spark.sql.MyDoubleSum;

 public class JavaDataFrameSuite {
--- End diff --

do you mean move JavaDataFrameSuite to sql/core ?
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871670

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
@@ -20,16 +20,19 @@ package org.apache.spark.sql.hive.execution
 import scala.collection.JavaConverters._
 import scala.util.Random
+import test.org.apache.spark.sql.MyDoubleAvg
+import test.org.apache.spark.sql.MyDoubleSum
+
 import org.apache.spark.sql._
 import org.apache.spark.sql.catalyst.expressions.UnsafeRow
 import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
 import org.apache.spark.sql.functions._
-import org.apache.spark.sql.hive.aggregate.{MyDoubleAvg, MyDoubleSum}
 import org.apache.spark.sql.hive.test.TestHiveSingleton
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.test.SQLTestUtils
 import org.apache.spark.sql.types._
+
--- End diff --

I didn't add any test in this file. Or do you mean move AggregationQuerySuite.scala to sql/core ?
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Merged build finished. Test PASSed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78556/ Test PASSed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412
**[Test build #78556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78556/testReport)** for PR 18412 at commit [`3be3475`](https://github.com/apache/spark/commit/3be3475d3da7e281f7c1a6599988a621c4d6b0f5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18371: [SPARK-20889][SparkR] Grouped documentation for M...
GitHub user actuaryzhang reopened a pull request: https://github.com/apache/spark/pull/18371

[SPARK-20889][SparkR] Grouped documentation for MATH column methods

## What changes were proposed in this pull request?

Grouped documentation for math column methods.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/actuaryzhang/spark sparkRDocMath

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18371.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #18371

commit 1b8880d2fe31a42949a947668f2d2927a094e941
Author: actuaryzhang
Date: 2017-06-20T21:44:32Z

    update doc for column math functions

commit ee0a1f24c8a6c44770b13e9b805ca56a0bbe7f2f
Author: actuaryzhang
Date: 2017-06-20T21:58:26Z

    add examples

commit 707b871160574297ef8eb75859d05d9ab13df02c
Author: actuaryzhang
Date: 2017-06-22T05:41:58Z

    add more examples and move doc for sign and ceiling

commit 6d5a259f872c178f3465a8b27e3ee9a2e7b05f21
Author: Wayne Zhang
Date: 2017-06-22T17:40:51Z

    Merge branch 'master' into sparkRDocMath

commit a158539fb69f8bbebb743b8d06d91cbbef36e950
Author: actuaryzhang
Date: 2017-06-22T17:45:15Z

    resolve conflicts
[GitHub] spark pull request #18371: [SPARK-20889][SparkR] Grouped documentation for M...
Github user actuaryzhang closed the pull request at: https://github.com/apache/spark/pull/18371
[GitHub] spark issue #18387: [SPARK-21174] [SQL] Validate sampling fraction in logica...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18387 oh I misread the code. Then we still have duplicated checking logic for SQL and Dataset, maybe we should put it in `CheckAnalysis`...
[GitHub] spark issue #18397: [SPARK-21159][core] Don't try to connect to launcher in ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18397 backported to 2.2/2.1
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871158

--- Diff: sql/hive/src/test/java/org/apache/spark/sql/hive/JavaDataFrameSuite.java ---
@@ -31,7 +31,7 @@
 import org.apache.spark.sql.expressions.UserDefinedAggregateFunction;
 import static org.apache.spark.sql.functions.*;
 import org.apache.spark.sql.hive.test.TestHive$;
-import org.apache.spark.sql.hive.aggregate.MyDoubleSum;
+import test.org.apache.spark.sql.MyDoubleSum;

 public class JavaDataFrameSuite {
--- End diff --

shall we merge this suite with `JavaDataFrameSuite` in sql/core?
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871136

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
@@ -20,16 +20,19 @@ package org.apache.spark.sql.hive.execution
 import scala.collection.JavaConverters._
 import scala.util.Random
+import test.org.apache.spark.sql.MyDoubleAvg
+import test.org.apache.spark.sql.MyDoubleSum
+
 import org.apache.spark.sql._
 import org.apache.spark.sql.catalyst.expressions.UnsafeRow
 import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
 import org.apache.spark.sql.functions._
-import org.apache.spark.sql.hive.aggregate.{MyDoubleAvg, MyDoubleSum}
 import org.apache.spark.sql.hive.test.TestHiveSingleton
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.test.SQLTestUtils
 import org.apache.spark.sql.types._
+
--- End diff --

shall we move this test suite to sql/core?
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871082

--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/MyDoubleAvg.java ---
@@ -28,6 +25,9 @@
 import org.apache.spark.sql.types.StructField;
 import org.apache.spark.sql.types.StructType;

+import java.util.ArrayList;
+import java.util.List;
--- End diff --

the import order is wrong here, please follow the previous style
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871085

--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/MyDoubleSum.java ---
@@ -15,18 +15,18 @@
  * limitations under the License.
  */
-package org.apache.spark.sql.hive.aggregate;
-
-import java.util.ArrayList;
-import java.util.List;
+package test.org.apache.spark.sql;
+import org.apache.spark.sql.Row;
 import org.apache.spark.sql.expressions.MutableAggregationBuffer;
 import org.apache.spark.sql.expressions.UserDefinedAggregateFunction;
-import org.apache.spark.sql.types.StructField;
-import org.apache.spark.sql.types.StructType;
 import org.apache.spark.sql.types.DataType;
 import org.apache.spark.sql.types.DataTypes;
-import org.apache.spark.sql.Row;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+import java.util.ArrayList;
+import java.util.List;
--- End diff --

ditto
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871068

--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java ---
@@ -40,6 +40,8 @@
 import org.apache.spark.sql.types.*;
 import org.apache.spark.util.sketch.BloomFilter;
 import org.apache.spark.util.sketch.CountMinSketch;
+import test.org.apache.spark.sql.MyDoubleAvg;
+import test.org.apache.spark.sql.MyDoubleSum;
--- End diff --

unnecessary change?
[GitHub] spark pull request #18397: [SPARK-21159][core] Don't try to connect to launc...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18397
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123871026

--- Diff: python/pyspark/sql/tests.py ---
@@ -481,6 +481,20 @@ def test_udf_registration_returns_udf(self):
             df.select(add_three("id").alias("plus_three")).collect()
         )

+    def test_non_existed_udf(self):
+        try:
+            self.spark.udf.registerJavaFunction("udf1", "non_existed_udf")
+            self.fail("should fail due to can not load java udf class")
+        except AnalysisException as e:
--- End diff --

shall we use `self.assertRaises` like other tests?
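For readers following the `self.assertRaises` suggestion, here is a minimal, self-contained sketch of the pattern. It is not the test from the PR: `AnalysisException` and `register_java_function` below are hypothetical stand-ins defined in the snippet so that it runs without a Spark session, and the error message is invented for illustration.

```python
import unittest


class AnalysisException(Exception):
    """Hypothetical stand-in for the exception type used in the PR's test."""


def register_java_function(name, java_class_name):
    """Hypothetical stand-in for spark.udf.registerJavaFunction.

    Always fails, mimicking a Java UDF class that cannot be loaded.
    """
    raise AnalysisException("Can not load class " + java_class_name)


class NonExistedUdfTest(unittest.TestCase):
    def test_non_existed_udf(self):
        # assertRaises fails the test on its own if nothing is raised,
        # so the explicit self.fail(...) from the try/except form goes away.
        with self.assertRaises(AnalysisException) as ctx:
            register_java_function("udf1", "non_existed_udf")
        # The caught exception stays available for further checks.
        self.assertIn("non_existed_udf", str(ctx.exception))


def run_sketch():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NonExistedUdfTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Compared with the try/`self.fail`/except form in the diff, the context-manager form keeps the expected failure and the call that triggers it in one place.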
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18400 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78555/ Test FAILed.
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18400 Merged build finished. Test FAILed.
[GitHub] spark issue #18397: [SPARK-21159][core] Don't try to connect to launcher in ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18397 LGTM, merging to master!
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18400
**[Test build #78555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78555/testReport)** for PR 18400 at commit [`5cdd328`](https://github.com/apache/spark/commit/5cdd328ee9a32969377cbdbfea229cc364dbee17).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78558/ Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Merged build finished. Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410
**[Test build #78558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78558/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Merged build finished. Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78557/ Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409
**[Test build #78557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78557/testReport)** for PR 18409 at commit [`7878c49`](https://github.com/apache/spark/commit/7878c4985bc949fddefc1b93cec4ac0aff478ac8).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18235: [SPARK-21012][Submit] Add glob support for resour...
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18235#discussion_r123870610 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -845,32 +840,54 @@ object SparkSubmit extends CommandLineUtils { */ private[deploy] def downloadFileList( fileList: String, + targetDir: File, + sparkConf: SparkConf, + securityManager: SecurityManager, hadoopConf: HadoopConfiguration): String = { require(fileList != null, "fileList cannot be null.") -fileList.split(",").map(downloadFile(_, hadoopConf)).mkString(",") +fileList.split(",") + .map(downloadFile(_, targetDir, sparkConf, securityManager, hadoopConf)) + .mkString(",") } /** * Download a file from the remote to a local temporary directory. If the input path points to * a local path, returns it with no operation. */ - private[deploy] def downloadFile(path: String, hadoopConf: HadoopConfiguration): String = { + private[deploy] def downloadFile( + path: String, + targetDir: File, --- End diff -- ditto
[GitHub] spark pull request #18235: [SPARK-21012][Submit] Add glob support for resour...
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18235#discussion_r123870606 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -845,32 +840,54 @@ object SparkSubmit extends CommandLineUtils { */ private[deploy] def downloadFileList( fileList: String, + targetDir: File, --- End diff -- Let's explain the meaning of each param.
[GitHub] spark pull request #18235: [SPARK-21012][Submit] Add glob support for resour...
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18235#discussion_r123870719 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -752,25 +793,35 @@ class SparkSubmitSuite test("downloadFile - invalid url") { intercept[IOException] { - SparkSubmit.downloadFile("abc:/my/file", new Configuration()) + val sparkConf = new SparkConf() --- End diff -- Should we create a `testDownloadFile()` function and merge the duplicated test code?
[GitHub] spark pull request #18408: [SPARK-20555][SQL] Fix mapping of Oracle DECIMAL ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18408
[GitHub] spark issue #17830: [SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types t...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17830 @gaborfeher Since the PR has been merged, could you please close it? Thanks!
[GitHub] spark issue #18408: [SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types t...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18408 Thanks! Merging to master/2.2
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410 **[Test build #78558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78558/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742).
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/18410 Jenkins, retest this please
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409 **[Test build #78557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78557/testReport)** for PR 18409 at commit [`7878c49`](https://github.com/apache/spark/commit/7878c4985bc949fddefc1b93cec4ac0aff478ac8).
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412 **[Test build #78556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78556/testReport)** for PR 18412 at commit [`3be3475`](https://github.com/apache/spark/commit/3be3475d3da7e281f7c1a6599988a621c4d6b0f5).
[GitHub] spark pull request #18373: [SPARK-20431][SS][FOLLOWUP] Specify a schema by u...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18373
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18409 retest this please
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18409 cc @cloud-fan
[GitHub] spark issue #18387: [SPARK-21174] [SQL] Validate sampling fraction in logica...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/18387 @cloud-fan The error message comes from AstBuilder. The code change in SqlBase.g4 allows tablesample to process negative numbers and throw a ParseException in AstBuilder. We can't throw a ParseException at the operator level.
[GitHub] spark issue #18373: [SPARK-20431][SS][FOLLOWUP] Specify a schema by using a ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18373 thanks, merging to master! @amoussoubaruch please post your question to dev list, instead of randomly picking a PR...
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18412 cc @liancheng @cloud-fan
[GitHub] spark pull request #18412: Fix wrong results of insertion of Array of Struct
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/18412 Fix wrong results of insertion of Array of Struct ### What changes were proposed in this pull request? ```SQL CREATE TABLE `tab1` (`custom_fields` ARRAY>) USING parquet INSERT INTO `tab1` SELECT ARRAY(named_struct('id', 1, 'value', 'a'), named_struct('id', 2, 'value', 'b')) SELECT custom_fields.id, custom_fields.value FROM tab1 ``` The above query always returns the last struct of the array, because the rule `SimplifyCasts` incorrectly rewrites the query. The underlying cause is that we always use the same `GenericInternalRow` object when doing the cast. ### How was this patch tested? You can merge this pull request into a Git repository by running: $ git pull https://github.com/gatorsmile/spark castStruct Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18412.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18412 commit 3be3475d3da7e281f7c1a6599988a621c4d6b0f5 Author: gatorsmile Date: 2017-06-24T03:29:38Z fix.
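The symptom described in this pull request (every array element showing the last struct) is what one would expect when a single mutable row object is reused for all elements. The following is a minimal sketch of that aliasing effect in plain Scala. `MutableRow`, `castShared`, and `castFresh` are hypothetical names made up for illustration; this is not Spark's `GenericInternalRow` or the actual cast code.

```scala
object RowReuseSketch {
  // Hypothetical stand-in for a mutable row; not Spark's GenericInternalRow.
  final class MutableRow(var id: Int, var value: String)

  // Buggy shape: one shared row is mutated for every element,
  // so every slot of the result refers to the same object.
  def castShared(input: Seq[(Int, String)]): Seq[MutableRow] = {
    val shared = new MutableRow(0, "")
    input.map { case (id, value) =>
      shared.id = id
      shared.value = value
      shared
    }
  }

  // Fixed shape: allocate a fresh row per element.
  def castFresh(input: Seq[(Int, String)]): Seq[MutableRow] =
    input.map { case (id, value) => new MutableRow(id, value) }
}
```

With the input `Seq((1, "a"), (2, "b"))`, `castShared` yields two references to the same row holding the last values, while `castFresh` keeps both structs distinct.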
[GitHub] spark issue #18387: [SPARK-21174] [SQL] Validate sampling fraction in logica...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18387 ah, check analysis also works. The problem then becomes: shall we do the checking in each operator, or put it in a central place like `CheckAnalysis`? BTW, putting it in the operator can give a better error message for SQL, e.g. ``` org.apache.spark.sql.catalyst.parser.ParseException Sampling fraction (1.01) must be on interval [0, 1](line 1, pos 24) == SQL == SELECT mydb1.t1 FROM t1 TABLESAMPLE (101 PERCENT) ^^^ ``` so users can know which part of their SQL goes wrong.
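The check under discussion amounts to a range test on the sampling fraction. Below is a minimal sketch in plain Scala, assuming the percentage is first converted to a fraction; `percentToFraction` and `validateFraction` are hypothetical helpers for illustration and are not the code in `AstBuilder`, the operator, or `CheckAnalysis`. Only the message wording is taken from the example above.

```scala
object SampleFractionSketch {
  // Hypothetical helper: TABLESAMPLE (101 PERCENT) corresponds to a fraction above 1.
  def percentToFraction(percent: Double): Double = percent / 100.0

  // Returns the fraction when it lies in [0, 1], otherwise an error message
  // shaped like the one quoted in the thread.
  def validateFraction(fraction: Double): Either[String, Double] =
    if (fraction >= 0.0 && fraction <= 1.0) Right(fraction)
    else Left(s"Sampling fraction ($fraction) must be on interval [0, 1]")
}
```

Where this check is called from (parser, operator constructor, or a central analysis rule) decides how much position information can be attached to the error, which is the trade-off being debated here.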
[GitHub] spark issue #18411: [SPARK-18004][SQL] Make sure the date or timestamp relat...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18411 Jenkins, test this please.
[GitHub] spark issue #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark History servi...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16924 no, history server is not a good use case, for long running jobs/apps, history server should ONLY detect the change and show the app after it's finished. Can you explain more about your use case for thrift-server?
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/18410 Jenkins, test it please
[GitHub] spark pull request #18375: [SPARK-21144][SQL] Print a warning if the data sc...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/18375#discussion_r123869166 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.util + +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.analysis._ +import org.apache.spark.sql.types.StructType + + +/** + * Utils for handling schemas. + * + * TODO: Merge this file with [[org.apache.spark.ml.util.SchemaUtils]]. + */ +private[spark] object SchemaUtils { + + // Returns true if a given resolver is case-sensitive + private def isCaseSensitiveAnalysis(resolver: Resolver): Boolean = { +if (resolver == caseSensitiveResolution) { + true +} else if (resolver == caseInsensitiveResolution) { + false +} else { + sys.error("A resolver to check if two identifiers are equal must be " + +"`caseSensitiveResolution` or `caseInsensitiveResolution` in o.a.s.sql.catalyst.") +} + } + + /** + * Checks if input column names have duplicate identifiers. This throws an exception if + * the duplication exists. 
+ * + * @param columnNames column names to check + * @param colType column type name, used in an exception message + * @param resolver resolver used to determine if two identifiers are equal + */ + def checkColumnNameDuplication( + columnNames: Seq[String], colType: String, resolver: Resolver): Unit = { +checkColumnNameDuplication(columnNames, colType, isCaseSensitiveAnalysis(resolver)) + } + + /** + * Checks if input column names have duplicate identifiers. This throws an exception if + * the duplication exists. + * + * @param columnNames column names to check + * @param colType column type name, used in an exception message + * @param caseSensitiveAnalysis whether duplication checks should be case sensitive or not + */ + def checkColumnNameDuplication( + columnNames: Seq[String], colType: String, caseSensitiveAnalysis: Boolean): Unit = { +val names = if (caseSensitiveAnalysis) { --- End diff -- ok
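The quoted diff stops right after `val names = if (caseSensitiveAnalysis) {`, so the rest of the method is not visible here. As a rough sketch of how a case-aware duplicate check can work, the plain-Scala helper below normalizes names when the check is case-insensitive and then groups them. `findDuplicates` is a hypothetical name and this is an assumption about the general technique, not the body of `checkColumnNameDuplication`.

```scala
import java.util.Locale

object DuplicateColumnsSketch {
  // Returns the (normalized) names that occur more than once, sorted for stable output.
  def findDuplicates(columnNames: Seq[String], caseSensitive: Boolean): Seq[String] = {
    // In case-insensitive mode, fold names to lower case before comparing.
    val names =
      if (caseSensitive) columnNames
      else columnNames.map(_.toLowerCase(Locale.ROOT))
    names
      .groupBy(identity)
      .collect { case (name, occurrences) if occurrences.size > 1 => name }
      .toSeq
      .sorted
  }
}
```

A caller that wants the behaviour described in the Scaladoc above could raise an exception mentioning `colType` whenever the returned sequence is non-empty.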
[GitHub] spark issue #18411: [SPARK-18004][SQL] Make sure the date or timestamp relat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18411 Can one of the admins verify this patch?
[GitHub] spark pull request #18411: Make sure the date or timestamp related predicate...
GitHub user SharpRay opened a pull request: https://github.com/apache/spark/pull/18411 Make sure the date or timestamp related predicate can be pushed down to Oracle correctly ## What changes were proposed in this pull request? Override the beforeFetch method in OracleDialect to do the following two things: - Set Oracle's NLS_TIMESTAMP_FORMAT to "YYYY-MM-DD HH24:MI:SS.FF" to match java.sql.Timestamp format. - Set Oracle's NLS_DATE_FORMAT to "YYYY-MM-DD" to match java.sql.Date format. ## How was this patch tested? An integration test has been added. You can merge this pull request into a Git repository by running: $ git pull https://github.com/SharpRay/spark oracle-date-timestamp-pushdown Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18411.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18411 commit 8e69b47be25f63b054dadcd8371c6386b2d6b0c3 Author: Rui Zha Date: 2017-06-24T03:03:06Z [SPARK-18006][SQL] Make sure date or timestamp related predicate is pushed down correctly
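One way to apply such session settings over JDBC is to issue `ALTER SESSION` statements before fetching. The sketch below assumes the intended formats are `YYYY-MM-DD HH24:MI:SS.FF` and `YYYY-MM-DD` (chosen to line up with the `java.sql.Timestamp` and `java.sql.Date` string forms); `OracleSessionFormatSketch` and `applySessionFormats` are hypothetical names and this is not the code from the pull request.

```scala
import java.sql.Connection

object OracleSessionFormatSketch {
  // Assumed format strings; adjust if the dialect uses different ones.
  val sessionStatements: Seq[String] = Seq(
    "ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'YYYY-MM-DD HH24:MI:SS.FF'",
    "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'"
  )

  // Runs each ALTER SESSION statement on the given connection and
  // closes the statement handle even if one of them fails.
  def applySessionFormats(connection: Connection): Unit = {
    val statement = connection.createStatement()
    try sessionStatements.foreach(sql => statement.execute(sql))
    finally statement.close()
  }
}
```

A dialect hook that runs before each fetch could call `applySessionFormats` with the connection it is handed, so that pushed-down date and timestamp literals are parsed with matching formats.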
[GitHub] spark issue #18406: [SPARK-21195] Automatically register new metrics from so...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18406 ok to test
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18400 **[Test build #78555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78555/testReport)** for PR 18400 at commit [`5cdd328`](https://github.com/apache/spark/commit/5cdd328ee9a32969377cbdbfea229cc364dbee17).
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18400 retest this please
[GitHub] spark issue #17714: [SPARK-20428][Core]REST interface about 'v1/submissions/...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/17714 Let me be explicit - I don't think the improvement is really needed in Spark, as long as it's just looping over the drivers and sending blocking KillDriver messages, since we gain nothing in performance this way. If you have a huge cluster with many drivers dying simultaneously (which, in my mind, should be a really extreme case), then it's fine to write a script and call it from outside of Spark.
[GitHub] spark issue #18313: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78554/ Test PASSed.
[GitHub] spark issue #18313: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18313 Merged build finished. Test PASSed.
[GitHub] spark issue #18313: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18313 **[Test build #78554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78554/testReport)** for PR 18313 at commit [`c7e0bcd`](https://github.com/apache/spark/commit/c7e0bcd1152c3245c1c190b71f11a22e61fd3ac5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17714: [SPARK-20428][Core]REST interface about 'v1/submissions/...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/17714 I can delete several 'drivers' in one request when these 'drivers' are dead and meaningless.
[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9518 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78552/ Test FAILed.
[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9518 Merged build finished. Test FAILed.
[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9518 **[Test build #78552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78552/testReport)** for PR 9518 at commit [`ccd3152`](https://github.com/apache/spark/commit/ccd3152af149257dc3ea7f298e27e7cbe1a1f439). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18313: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18313 **[Test build #78554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78554/testReport)** for PR 18313 at commit [`c7e0bcd`](https://github.com/apache/spark/commit/c7e0bcd1152c3245c1c190b71f11a22e61fd3ac5).
[GitHub] spark issue #18408: [SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18408 Merged build finished. Test PASSed.
[GitHub] spark issue #18408: [SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18408 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78546/ Test PASSed.
[GitHub] spark issue #18408: [SPARK-20555][SQL] Fix mapping of Oracle DECIMAL types t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18408 **[Test build #78546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78546/testReport)** for PR 18408 at commit [`d8d4fad`](https://github.com/apache/spark/commit/d8d4fadd856b81681f13eb2b1f7caab358126c59). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18364: [SPARK-21153] Use project instead of expand in tumbling ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18364 Merged build finished. Test PASSed.
[GitHub] spark issue #18364: [SPARK-21153] Use project instead of expand in tumbling ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18364 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78547/ Test PASSed.
[GitHub] spark issue #18364: [SPARK-21153] Use project instead of expand in tumbling ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18364 **[Test build #78547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78547/testReport)** for PR 18364 at commit [`c59a0de`](https://github.com/apache/spark/commit/c59a0dee87b7e3dde4154d7bb22e7219ba5cb217). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Merged build finished. Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78551/ Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409 **[Test build #78551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78551/testReport)** for PR 18409 at commit [`7878c49`](https://github.com/apache/spark/commit/7878c4985bc949fddefc1b93cec4ac0aff478ac8). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17583 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78553/ Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78550/ Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Merged build finished. Test FAILed.
[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17583 **[Test build #78553 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78553/testReport)** for PR 17583 at commit [`6ee45ba`](https://github.com/apache/spark/commit/6ee45bad40eafd138a234f9b9c6a0782cb7aeaf4). * This patch **fails PySpark pip packaging tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17583 Merged build finished. Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409 **[Test build #78550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78550/testReport)** for PR 18409 at commit [`a349962`](https://github.com/apache/spark/commit/a349962eb04054a94de01b976b5c6217ea72519b). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Merged build finished. Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78549/ Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410 **[Test build #78549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78549/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742). * This patch **fails PySpark pip packaging tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Merged build finished. Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18409 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78548/ Test FAILed.
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409 **[Test build #78548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78548/testReport)** for PR 18409 at commit [`42238b8`](https://github.com/apache/spark/commit/42238b862e7b6e984967139e68259d8a5caae20f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17583 **[Test build #78553 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78553/testReport)** for PR 17583 at commit [`6ee45ba`](https://github.com/apache/spark/commit/6ee45bad40eafd138a234f9b9c6a0782cb7aeaf4).
[GitHub] spark pull request #18195: [SPARK-20921][SQL][WIP] Support can config Oracle...
Github user wangyum closed the pull request at: https://github.com/apache/spark/pull/18195
[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9518 **[Test build #78552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78552/testReport)** for PR 9518 at commit [`ccd3152`](https://github.com/apache/spark/commit/ccd3152af149257dc3ea7f298e27e7cbe1a1f439).
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409 **[Test build #78551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78551/testReport)** for PR 18409 at commit [`7878c49`](https://github.com/apache/spark/commit/7878c4985bc949fddefc1b93cec4ac0aff478ac8).
[GitHub] spark issue #18409: [SPARK-21196] Split codegen info of query plan into sequ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18409 **[Test build #78550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78550/testReport)** for PR 18409 at commit [`a349962`](https://github.com/apache/spark/commit/a349962eb04054a94de01b976b5c6217ea72519b).
[GitHub] spark pull request #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user xflin commented on a diff in the pull request: https://github.com/apache/spark/pull/9518#discussion_r123857033 --- Diff: core/src/main/scala/org/apache/spark/metrics/sink/StatsdReporter.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.metrics.sink + +import java.io.IOException +import java.net.{DatagramPacket, DatagramSocket, InetSocketAddress} +import java.nio.charset.StandardCharsets.UTF_8 +import java.util.SortedMap +import java.util.concurrent.TimeUnit + +import scala.collection.JavaConverters._ +import scala.util.{Failure, Success, Try} + +import com.codahale.metrics._ +import org.apache.hadoop.net.NetUtils + +import org.apache.spark.Logging + +/** + * @see https://github.com/etsy/statsd/blob/master/docs/metric_types.md";> + *StatsD metric types + */ +private[spark] sealed trait StatsdMetricType { + val COUNTER = "c" + val GAUGE = "g" + val TIMER = "ms" + val Set = "s" +} + +private[spark] class StatsdReporter( +registry: MetricRegistry, +host: String = "127.0.0.1", +port: Int = 8125, +prefix: String = "", +filter: MetricFilter = MetricFilter.ALL, +rateUnit: TimeUnit = TimeUnit.SECONDS, +durationUnit: TimeUnit = TimeUnit.MILLISECONDS) + extends ScheduledReporter(registry, "statsd-reporter", filter, rateUnit, durationUnit) + with StatsdMetricType with Logging { + + private val address = new InetSocketAddress(host, port) + private val whitespace = "[\\s]+".r + + override def report( + gauges: SortedMap[String, Gauge[_]], + counters: SortedMap[String, Counter], + histograms: SortedMap[String, Histogram], + meters: SortedMap[String, Meter], + timers: SortedMap[String, Timer]): Unit = +Try(new DatagramSocket) match { + case Failure(ioe: IOException) => logWarning("StatsD datagram socket construction failed", +NetUtils.wrapException(host, port, "0.0.0.0", 0, ioe)) + case Failure(e) => logWarning("StatsD datagram socket construction failed", e) + case Success(s) => +implicit val socket = s +val localAddress = Try(socket.getLocalAddress).map(_.getHostAddress).getOrElse(null) +val localPort = socket.getLocalPort +Try { + gauges.entrySet.asScala.foreach(e => reportGauge(e.getKey, e.getValue)) + counters.entrySet.asScala.foreach(e => reportCounter(e.getKey, 
e.getValue)) + histograms.entrySet.asScala.foreach(e => reportHistogram(e.getKey, e.getValue)) + meters.entrySet.asScala.foreach(e => reportMetered(e.getKey, e.getValue)) + timers.entrySet.asScala.foreach(e => reportTimer(e.getKey, e.getValue)) +} recover { + case ioe: IOException => +logDebug(s"Unable to send packets to StatsD", NetUtils.wrapException( + address.getHostString, address.getPort, localAddress, localPort, ioe)) + case e: Throwable => logDebug(s"Unable to send packets to StatsD at '$host:$port'", e) +} +Try(socket.close()) recover { + case ioe: IOException => +logDebug("Error when close socket to StatsD", NetUtils.wrapException( + address.getHostString, address.getPort, localAddress, localPort, ioe)) + case e: Throwable => logDebug("Error when close socket to StatsD", e) +} +} + + private def reportGauge(name: String, gauge: Gauge[_])(implicit socket: DatagramSocket) = +formatAny(gauge.getValue).foreach(v => send(fullName(name), v, GAUGE)) + + private def reportCounter(name: String, counter: Counter)(implicit socket: DatagramSocket) = +send(fullName(name), format(counter.getCount), COUNTER) + + private def reportHistogram(name: String, histogram: Histogram) + (implicit soc
[GitHub] spark pull request #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user xflin commented on a diff in the pull request: https://github.com/apache/spark/pull/9518#discussion_r123856611 --- Diff: core/src/test/scala/org/apache/spark/metrics/sink/StatsdSinkSuite.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.metrics.sink + +import java.net.{DatagramPacket, DatagramSocket} +import java.nio.charset.StandardCharsets.UTF_8 +import java.util.Properties +import java.util.concurrent.TimeUnit._ + +import com.codahale.metrics._ + +import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite} +import org.apache.spark.metrics.sink.StatsdSink._ + +class StatsdSinkSuite extends SparkFunSuite { + val securityMgr = new SecurityManager(new SparkConf(false)) + val defaultProps = Map( +STATSD_KEY_PREFIX -> "spark", +STATSD_KEY_PERIOD -> "1", +STATSD_KEY_UNIT -> "seconds", +STATSD_KEY_HOST -> "127.0.0.1" + ) + val socketTimeout = 3000 // milliseconds + val socketBufferSize = 1024 + + def makeFixture(): (DatagramSocket, StatsdSink) = { +val socket = new DatagramSocket +socket.setReceiveBufferSize(socketBufferSize) +socket.setSoTimeout(socketTimeout) +val props = new Properties +defaultProps.foreach(e => props.put(e._1, e._2)) +props.put(STATSD_KEY_PORT, socket.getLocalPort.toString) +val registry = new MetricRegistry +val sink = new StatsdSink(props, registry, securityMgr) +(socket, sink) + } + + test("metrics StatsD sink with Counter") { +val (socket, sink) = makeFixture() +try { + val counter = new Counter + counter.inc(12) + sink.registry.register("counter", counter) + sink.report() + + val p = new DatagramPacket(new Array[Byte](socketBufferSize), socketBufferSize) + socket.receive(p) + + val result = new String(p.getData, 0, p.getLength, UTF_8) + assert(result === "spark.counter:12|c", "Counter metric received should match data sent") +} finally socket.close() --- End diff -- See above response.
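For readers following the review, the assertion in the quoted test expects the datagram payload `spark.counter:12|c`, i.e. the StatsD line shape `<name>:<value>|<type>`. A minimal sketch of how such a line could be assembled is below. The object and method names (`StatsdLineSketch`, `sanitize`, `metricLine`) are hypothetical and not taken from the PR, and replacing whitespace runs with `-` is an assumption; the quoted diff defines a whitespace regex, but its use is cut off in this archive.

```scala
object StatsdLineSketch {
  // Type suffixes as listed in the quoted StatsdMetricType trait.
  val Counter = "c"
  val Gauge = "g"
  val Timer = "ms"
  val Set = "s"

  // Same pattern as the `whitespace` field in the quoted diff.
  private val whitespace = "[\\s]+".r

  // Assumption: whitespace runs in a metric name are replaced with '-'.
  def sanitize(name: String): String = whitespace.replaceAllIn(name, "-")

  // Builds "<prefix>.<name>:<value>|<type>"; the prefix is skipped when empty.
  def metricLine(prefix: String, name: String, value: String, metricType: String): String = {
    val full = if (prefix.isEmpty) name else s"$prefix.$name"
    s"${sanitize(full)}:$value|$metricType"
  }

  def main(args: Array[String]): Unit = {
    // Mirrors the payload asserted in the quoted test.
    println(metricLine("spark", "counter", "12", Counter))
  }
}
```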
[GitHub] spark pull request #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user xflin commented on a diff in the pull request: https://github.com/apache/spark/pull/9518#discussion_r123856570 --- Diff: core/src/test/scala/org/apache/spark/metrics/sink/StatsdSinkSuite.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.metrics.sink + +import java.net.{DatagramPacket, DatagramSocket} +import java.nio.charset.StandardCharsets.UTF_8 +import java.util.Properties +import java.util.concurrent.TimeUnit._ + +import com.codahale.metrics._ + +import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite} +import org.apache.spark.metrics.sink.StatsdSink._ + +class StatsdSinkSuite extends SparkFunSuite { + val securityMgr = new SecurityManager(new SparkConf(false)) + val defaultProps = Map( +STATSD_KEY_PREFIX -> "spark", +STATSD_KEY_PERIOD -> "1", +STATSD_KEY_UNIT -> "seconds", +STATSD_KEY_HOST -> "127.0.0.1" + ) + val socketTimeout = 3000 // milliseconds + val socketBufferSize = 1024 + + def makeFixture(): (DatagramSocket, StatsdSink) = { --- End diff -- Since I'd like the fixtures (socket and sink) to be different for each test to avoid concurrency issues, before/after doesn't look like a fit. 
I can instead leverage [loan-fixture methods](http://www.scalatest.org/user_guide/sharing_fixtures#loanFixtureMethods) to reduce the duplicated calls to `makeFixture` and `socket.close()`.
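The loan-fixture idea mentioned in this comment can be sketched in plain Scala without ScalaTest. This is only an illustration under stated assumptions: `withSocket`, `sendMetric` and `receiveMetric` are hypothetical helpers, and `sendMetric` stands in for `sink.report()` because `StatsdSink` is not available outside the PR.

```scala
import java.net.{DatagramPacket, DatagramSocket, InetAddress}
import java.nio.charset.StandardCharsets.UTF_8

object LoanFixtureSketch {
  val socketTimeout = 3000 // milliseconds
  val socketBufferSize = 1024

  // Loan-fixture method: creates a fresh socket for every call, lends it to
  // the test body, and always closes it, even if the body throws.
  def withSocket[T](testCode: DatagramSocket => T): T = {
    val socket = new DatagramSocket(0, InetAddress.getLoopbackAddress)
    socket.setSoTimeout(socketTimeout)
    try testCode(socket) finally socket.close()
  }

  // Hypothetical stand-in for sink.report(): sends one UDP datagram.
  def sendMetric(payload: String, port: Int): Unit = {
    val sender = new DatagramSocket()
    try {
      val bytes = payload.getBytes(UTF_8)
      sender.send(new DatagramPacket(bytes, bytes.length, InetAddress.getLoopbackAddress, port))
    } finally sender.close()
  }

  // Reads one datagram from the loaned socket and decodes it as UTF-8.
  def receiveMetric(socket: DatagramSocket): String = {
    val p = new DatagramPacket(new Array[Byte](socketBufferSize), socketBufferSize)
    socket.receive(p)
    new String(p.getData, 0, p.getLength, UTF_8)
  }

  def main(args: Array[String]): Unit = {
    val result = withSocket { socket =>
      sendMetric("spark.counter:12|c", socket.getLocalPort)
      receiveMetric(socket)
    }
    assert(result == "spark.counter:12|c", "Counter metric received should match data sent")
  }
}
```

Each test body gets its own socket, so tests stay isolated from one another, and the `try`/`finally` lives in one place instead of being repeated per test.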
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410 **[Test build #78549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78549/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742).
[GitHub] spark pull request #18410: [SS][SPARK-20971] purge metadata log in FileStrea...
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/18410 [SS][SPARK-20971] purge metadata log in FileStreamSource ## What changes were proposed in this pull request? Currently, there is no cleanup mechanism for FileStreamSource's metadata log, so the log grows without bound. This PR purges the log entries that fall outside the retention window. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/CodingCat/spark SPARK-20971 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18410.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18410 commit 9c21aca0ea24836259a5c94931ed9c7831607192 Author: CodingCat Date: 2016-03-07T14:37:37Z improve the doc for "spark.memory.offHeap.size" commit c82e3822751a461550906a6d78117633e7db4d1f Author: CodingCat Date: 2016-03-07T19:00:16Z fix commit b37ec112e01880e3d67d81972bae33487763c742 Author: Nan Zhu Date: 2017-06-23T22:28:31Z purge log in metadatalog of filesourcestream
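As a rough illustration of what purging outside a retention window means, here is a self-contained sketch. It is not the code from this pull request: `InMemoryMetadataLog`, `PurgeSketch`, `purgeOutsideWindow`, and the `minBatchesToRetain` parameter are hypothetical names, and the log is a plain in-memory map rather than FileStreamSource's on-disk metadata log.

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-in for a metadata log indexed by batch id.
class InMemoryMetadataLog[T] {
  private val entries = mutable.Map.empty[Long, T]

  def add(batchId: Long, metadata: T): Unit = entries.put(batchId, metadata)

  // Batch ids currently held, in ascending order.
  def batchIds: Seq[Long] = entries.keys.toSeq.sorted

  // Drops every entry whose batch id is strictly below the threshold.
  def purge(thresholdBatchId: Long): Unit = {
    val stale = entries.keys.filter(_ < thresholdBatchId).toList
    stale.foreach(entries.remove)
  }
}

object PurgeSketch {
  // Keeps only the most recent `minBatchesToRetain` batches (assumed name),
  // counting the current batch as the newest one.
  def purgeOutsideWindow[T](
      log: InMemoryMetadataLog[T],
      currentBatchId: Long,
      minBatchesToRetain: Int): Unit = {
    val threshold = currentBatchId - minBatchesToRetain + 1
    if (threshold > 0) log.purge(threshold)
  }
}
```

With batches 0 through 9 in the log, calling `PurgeSketch.purgeOutsideWindow(log, 9L, 3)` computes a threshold of 7 and leaves only batches 7, 8, and 9.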