[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14081 Double checking with @jkbradley that the example removals look OK? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14137: SPARK-16478 graphX (added graph caching in strongly conn...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14137 Oh good point, it does get materialized after `cache()` already because `numVertices` will call `count`. That does mean there's more than one call to evaluate the RDD, and that quite changes things. I see why `sccWorkGraph` works that way. I'm still a little hazy on `sccGraph` because the same isn't true of it in the current code. I agree that the dependency on `finalVertices` is relevant. As you say, the underlying `sccWorkGraph`s are cached. I think there does end up being a problem, because it still means the whole lineage of cached `sccWorkGraph`s is evaluated at once and tries to cache. I think your change is probably the right thing, then. What we should really do is add `unpersist` calls in the right places for `sccGraph` and `sccWorkGraph`, which is tricky. That would fully optimize this. (No, my point was that the variable `sccGraphCountVertices` itself doesn't do anything. You don't need to store this value. `.count` does something, of course.)
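The cache-then-action behaviour discussed in this comment can be sketched with a toy model. This is not Spark's RDD or GraphX API: `LazyDataset`, `evaluations`, and `CacheDemo` are hypothetical names, and the sketch only assumes that `cache()` marks data for caching, an action such as `count()` materializes it, and `unpersist()` releases it.

```scala
// Toy model only: this is NOT Spark's RDD, just the cache-then-action pattern.
final class LazyDataset[T](compute: () => Seq[T]) {
  private var cacheRequested = false
  private var cached: Option[Seq[T]] = None
  var evaluations = 0 // how many times `compute` actually ran

  // Marks the dataset for caching; nothing is evaluated yet.
  def cache(): this.type = { cacheRequested = true; this }

  // Drops any cached data so the memory can be reclaimed.
  def unpersist(): this.type = { cacheRequested = false; cached = None; this }

  // Reuses cached data when present, otherwise recomputes.
  private def materialize(): Seq[T] = cached.getOrElse {
    evaluations += 1
    val data = compute()
    if (cacheRequested) cached = Some(data)
    data
  }

  // An action: forces evaluation and fills the cache if one was requested.
  def count(): Long = materialize().size.toLong
}

object CacheDemo {
  // Returns the evaluation counter after cache(), after two counts, and after unpersist + count.
  def run(): (Int, Int, Int) = {
    val ds = new LazyDataset(() => Seq(1, 2, 3)).cache()
    val afterCache = ds.evaluations
    ds.count()
    ds.count()
    val afterCounts = ds.evaluations
    ds.unpersist()
    ds.count()
    val afterUnpersist = ds.evaluations
    (afterCache, afterCounts, afterUnpersist)
  }
}
```

In this model the value returned by `count()` does not need to be stored for the materialization to happen, which mirrors the closing remark about `sccGraphCountVertices`.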
[GitHub] spark issue #14141: [SPARK-16375] [Web UI] Fixed misassigned var: numComplet...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14141 Oops, great catch
[GitHub] spark issue #14142: [SPARK-16439] Fix number formatting in SQL UI
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14142 LGTM. Actually all other instances of NumberFormat in the project omit grouping separators, for various reasons.
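For readers unfamiliar with grouping separators, here is a minimal sketch using `java.text.NumberFormat`. The object name `GroupingDemo` and the choice of `Locale.US` are assumptions made for illustration; this is not the code from the pull request.

```scala
import java.text.NumberFormat
import java.util.Locale

object GroupingDemo {
  // Formats an integer value in a fixed locale, with or without grouping separators.
  def format(value: Long, grouping: Boolean): String = {
    val nf = NumberFormat.getIntegerInstance(Locale.US)
    nf.setGroupingUsed(grouping)
    nf.format(value)
  }
}
```

With grouping enabled the output depends on the locale's separator character, which is one reason a formatter might be pinned to a fixed locale or have grouping switched off.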
[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14116 Merged build finished. Test PASSed.
[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14116 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62146/ Test PASSed.
[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14116 **[Test build #62146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62146/consoleFull)** for PR 14116 at commit [`a8dc9f4`](https://github.com/apache/spark/commit/a8dc9f41427e20e83f915ec845cbde728bb4a8d8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14148: [SPARK-16482] [SQL] Describe Table Command for Tables Re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14148 Merged build finished. Test FAILed.
[GitHub] spark issue #14148: [SPARK-16482] [SQL] Describe Table Command for Tables Re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14148 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62143/ Test FAILed.
[GitHub] spark issue #14148: [SPARK-16482] [SQL] Describe Table Command for Tables Re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14148 **[Test build #62143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62143/consoleFull)** for PR 14148 at commit [`473b27d`](https://github.com/apache/spark/commit/473b27deeb49096ddd38f1b4d4ca03207aa9e025). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14148: [SPARK-16482] [SQL] Describe Table Command for Tables Re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14148 Merged build finished. Test FAILed.
[GitHub] spark issue #14148: [SPARK-16482] [SQL] Describe Table Command for Tables Re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14148 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62144/ Test FAILed.
[GitHub] spark issue #14148: [SPARK-16482] [SQL] Describe Table Command for Tables Re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14148 **[Test build #62144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62144/consoleFull)** for PR 14148 at commit [`a05383c`](https://github.com/apache/spark/commit/a05383c8ff4483dacdf34070173b965ab6f7d4ca). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13704: [SPARK-15985][SQL] Eliminate redundant cast from ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r70386711
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCastsSuite.scala ---
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+import org.apache.spark.sql.types._
+
+class SimplifyCastsSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches = Batch("SimplifyCasts", FixedPoint(50), SimplifyCasts) :: Nil
+  }
+
+  test("non-nullable to non-nullable array cast") {
+    val input = LocalRelation('a.array(ArrayType(IntegerType, false)))
+    val array_intPrimitive = Literal.create(
+      Seq(1, 2, 3, 4, 5), ArrayType(IntegerType, false))
+    val plan = input.select(array_intPrimitive
+      .cast(ArrayType(IntegerType, false)).as('a)).analyze
+    val optimized = Optimize.execute(plan)
+    val expected = input.select(array_intPrimitive.as('a)).analyze
+    comparePlans(optimized, expected)
+  }
+
+  test("non-nullable to nullable array cast") {
+    val input = LocalRelation('a.array(ArrayType(IntegerType, false)))
+    val array_intPrimitive = Literal.create(
+      Seq(1, 2, 3, 4, 5), ArrayType(IntegerType, false))
+    val plan = input.select(array_intPrimitive
+      .cast(ArrayType(IntegerType, true)).as('a)).analyze
+    val optimized = Optimize.execute(plan)
+    val expected = input.select(array_intPrimitive.as('a)).analyze
+    comparePlans(optimized, expected)
+  }
+
+  test("nullable to non-nullable array cast") {
+    val input = LocalRelation('a.array(ArrayType(IntegerType, true)))
+    val array_intNull = Literal.create(
+      Seq(1, 2, null, 4, 5), ArrayType(IntegerType, true))
+    val plan = input.select(array_intNull
+      .cast(ArrayType(IntegerType, false)).as('a)).analyze
+    val optimized = Optimize.execute(plan)
+    assert(optimized.resolved === false)
--- End diff --
we can check `comparePlans(optimized, plan)`
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/14065 Also, thinking about an example to land this feature: I think Kafka might be one candidate, since they also have a delegation-token-based proposal, [KIP-48](https://cwiki.apache.org/confluence/display/KAFKA/KIP-48+Delegation+token+support+for+Kafka).
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/14065

@tgravescs and @vanzin, I have recently done some refactoring on this patch. Here are the changes compared to the previous code:

1. Changed the interface `ServiceTokenProvider` to `ServiceCredentialProvider`, with the main method renamed to `obtainCredentials` as suggested in the comments. Since we are no longer limited to tokens, the method now obtains credentials rather than tokens. `obtainCredentials` is defined as:

```scala
def obtainCredentials(hadoopConf: Configuration, creds: Credentials): Option[Long]
```

The return value `Option[Long]` is the time of the next renewal: `Some(Long)` if the credential is renewable, otherwise `None`. Also removed several redundant methods, such as the one for getting the token renewal interval.
2. Changed `ConfigurableTokenManager` to `ConfigurableCredentialManager`, which manages all the credential providers.
3. Changed the way credential providers are loaded to `ServiceLoader`, as suggested in the comments.
4. Changed initialization from a singleton to normal instantiation.
6. Changed the mechanism for checking credentials in `AMDelegationRenewer` and `ExecutorDelegationTokenUpdate`. Since we can now get the time of the next renewal, we use it to decide when to wake up and check for new credentials.

Please help review. Thanks a lot for your time; I greatly appreciate your comments.
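The provider interface described in this comment can be sketched as follows. Only the `obtainCredentials` signature comes from the comment. `Configuration` and `Credentials` here are empty stand-ins for the Hadoop classes so the sketch compiles on its own, and `FixedProvider` and `CredentialManagerSketch.nextRenewal` are hypothetical helpers, not code from the patch.

```scala
// Stand-ins for Hadoop's Configuration and Credentials (assumption: empty placeholders).
final class Configuration
final class Credentials

trait ServiceCredentialProvider {
  // Returns the time of the next renewal, or None if the credential is not renewable.
  def obtainCredentials(hadoopConf: Configuration, creds: Credentials): Option[Long]
}

// Hypothetical provider that always reports a fixed next-renewal time.
final class FixedProvider(next: Option[Long]) extends ServiceCredentialProvider {
  def obtainCredentials(hadoopConf: Configuration, creds: Credentials): Option[Long] = next
}

object CredentialManagerSketch {
  // Hypothetical helper: query every provider and return the earliest next-renewal time, if any.
  def nextRenewal(
      providers: Seq[ServiceCredentialProvider],
      conf: Configuration,
      creds: Credentials): Option[Long] = {
    val times = providers.flatMap(_.obtainCredentials(conf, creds))
    if (times.isEmpty) None else Some(times.min)
  }
}
```

A renewer thread could sleep until the returned time before checking again. Loading providers through `java.util.ServiceLoader.load(classOf[ServiceCredentialProvider])` would additionally need a `META-INF/services` entry, which is omitted here.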
[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13890 **[Test build #62150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62150/consoleFull)** for PR 13890 at commit [`61da040`](https://github.com/apache/spark/commit/61da04057866c16c1f55ae9ecc448042fecd57c9).
[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13704 **[Test build #62152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62152/consoleFull)** for PR 13704 at commit [`8dd829a`](https://github.com/apache/spark/commit/8dd829a4922441cc09dee08b532e6b3c90780535).
[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13778 **[Test build #62151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62151/consoleFull)** for PR 13778 at commit [`6065364`](https://github.com/apache/spark/commit/6065364da697cd29f9b31179063e6cf604aa25ef).
[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13778 @cloud-fan Updated. Please take a look. Thanks.
[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13890 will merge it once tests pass, thanks for working on it!
[GitHub] spark issue #13890: [SPARK-16189][SQL] Add ExternalRDD logical plan for inpu...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13890 retest this please
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14090 Merged build finished. Test PASSed.
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14090 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62147/ Test PASSed.
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14090 **[Test build #62147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62147/consoleFull)** for PR 14090 at commit [`2af7243`](https://github.com/apache/spark/commit/2af724321e0d51aed64c84dd22741a7cc6067caf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14138: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14138 LGTM except some style comments. Thanks for working on it!
[GitHub] spark pull request #14138: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14138#discussion_r70383800
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class MiscFunctionsSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  test("reflect and java_method") {
+    val df = Seq((1, "one")).toDF("a", "b")
+    checkAnswer(
+      df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"),
--- End diff --
we should use `classOf[ReflectClass]...` instead of hardcoding the class name. And also test `java_method` to match the test name.
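The suggestion to derive the class name from `classOf` rather than a string literal might look like the sketch below. The `ReflectClass` here is a hypothetical stand-in for the test helper in the pull request, and `ClassNameDemo.reflectExpr` is an illustrative helper, not part of Spark.

```scala
// Hypothetical stand-in for the helper class referenced in the review.
class ReflectClass {
  def method1(a: Int, b: String): String = "m" + a + b
}

object ClassNameDemo {
  // Builds a `reflect(...)` SQL expression string, taking the class name from
  // the class object so it stays correct if the class is renamed or moved.
  def reflectExpr(method: String, args: String*): String = {
    val className = classOf[ReflectClass].getName
    (s"'$className'" +: s"'$method'" +: args).mkString("reflect(", ", ", ")")
  }
}
```

A test could then call `df.selectExpr(ClassNameDemo.reflectExpr("method1", "a", "b"))`, and the same string could be built with `java_method` in place of `reflect` to cover both function names.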
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62149/consoleFull)** for PR 14065 at commit [`60a275f`](https://github.com/apache/spark/commit/60a275f9487eacab8ce8c70d6917d3aebe16d131).
[GitHub] spark pull request #14138: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14138#discussion_r70383507
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflectionSuite.scala ---
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.TypeCheckFailure
+import org.apache.spark.sql.types.{IntegerType, StringType}
+
+/** A static class for testing purpose. */
+object ReflectStaticClass {
+  def method1(): String = "m1"
+  def method2(v1: Int): String = "m" + v1
+  def method3(v1: java.lang.Integer): String = "m" + v1
+  def method4(v1: Int, v2: String): String = "m" + v1 + v2
+}
+
+/** A non-static class for testing purpose. */
+class ReflectDynamicClass {
+  def method1(): String = "m1"
+}
+
+/**
+ * Test suite for [[CallMethodViaReflection]] and its companion object.
+ */
+class CallMethodViaReflectionSuite extends SparkFunSuite with ExpressionEvalHelper {
+
+  import CallMethodViaReflection._
+
+  // Get rid of the $ so we are getting the companion object's name.
+  private val staticClassName = ReflectStaticClass.getClass.getName.replace("$", "")
--- End diff --
nit: we can use `stripSuffix("$")`
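The difference between the two approaches in this nit can be shown in a few lines of plain Scala. `StripSuffixDemo` and the sample names are hypothetical; the sketch only relies on the standard `String.replace` and `stripSuffix` methods.

```scala
object StripSuffixDemo {
  // Removes every "$" in the name, including ones that separate nested classes.
  def viaReplace(name: String): String = name.replace("$", "")

  // Removes only a single trailing "$", which is what marks a Scala object's class.
  def viaStripSuffix(name: String): String = name.stripSuffix("$")
}
```

For a top-level object such as `pkg.ReflectStaticClass$` both give `pkg.ReflectStaticClass`, but for a nested name such as `pkg.Outer$Inner$` only `stripSuffix` keeps the inner separator intact.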
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14090 Merged build finished. Test PASSed.
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14090 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62145/ Test PASSed.
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14090 **[Test build #62145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62145/consoleFull)** for PR 14090 at commit [`c1d7151`](https://github.com/apache/spark/commit/c1d71512a3bf0205615d1b6318029ad6f33d94dc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13650: [SPARK-9623] [ML] Provide variance for RandomForestRegre...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13650 @MechCoder Sorry for the late response. I will make a pass soon.
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62148/consoleFull)** for PR 14065 at commit [`0fbf25b`](https://github.com/apache/spark/commit/0fbf25b51285ea433a3dcf733b92a1d785b3d017). * This patch **fails RAT tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14065 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62148/ Test FAILed.
[GitHub] spark pull request #14138: [SPARK-16284][SQL] Implement reflect SQL function
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14138#discussion_r70382966 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala --- @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.expressions + +import java.lang.reflect.{Method, Modifier} + +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult +import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess} +import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.UTF8String +import org.apache.spark.util.Utils + +/** + * An expression that invokes a method on a class via reflection. + * + * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives + * and string) as input types, and the output is turned automatically to a string. + * + * Note that unlike Hive's reflect function, this expression calls only static methods + * (i.e. does not support calling non-static methods). 
+ * + * We should also look into how to consolidate this expression with + * [[org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke]] in the future. + * + * @param children the first element should be a literal string for the class name, + * and the second element should be a literal string for the method name, + * and the remaining are input arguments to the Java method. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection", + extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2") +// scalastyle:on line.size.limit +case class CallMethodViaReflection(children: Seq[Expression]) + extends Expression with CodegenFallback { + + override def prettyName: String = "reflect" + + override def checkInputDataTypes(): TypeCheckResult = { +if (children.size < 2) { + TypeCheckFailure("requires at least two arguments") +} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) { + // The first two arguments must be string type. + TypeCheckFailure("first two arguments should be string literals") +} else if (!classExists) { + TypeCheckFailure(s"class $className not found") +} else if (method == null) { + TypeCheckFailure(s"cannot find a static method that matches the argument types in $className") +} else { + TypeCheckSuccess +} + } + + override def deterministic: Boolean = false + override def nullable: Boolean = true + override val dataType: DataType = StringType + + override def eval(input: InternalRow): Any = { +var i = 0 +while (i < argExprs.length) { + buffer(i) = argExprs(i).eval(input).asInstanceOf[Object] + // Convert if necessary. Based on the types defined in typeMapping, string is the only + // type that needs conversion. If we support timestamps, dates, decimals, arrays, or maps + // in the future, proper conversion needs to happen here too. 
+ if (buffer(i).isInstanceOf[UTF8String]) { +buffer(i) = buffer(i).toString + } + i += 1 +} +val ret = method.invoke(null, buffer : _*) +UTF8String.fromString(String.valueOf(ret)) + } + + @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray + + /** Name of the class -- this has to be called after we verify children has at least two exprs. */ + @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString --- End diff --
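The core of the `eval` method in the diff above is a static-method lookup followed by `method.invoke(null, ...)` and `String.valueOf` on the result. A minimal standalone sketch of that idea, using only `java.lang.reflect` and no Spark classes, could look as follows. The helper `callStatic` and its signature are hypothetical and are not part of the PR; unlike the expression under review, it takes the argument classes explicitly instead of deriving them from a type mapping.

```scala
import java.lang.reflect.Modifier

object ReflectSketch {
  // Hypothetical helper: find a public static method by class name, method
  // name and argument classes, invoke it, and render the result as a string.
  def callStatic(className: String, methodName: String, args: (Class[_], AnyRef)*): String = {
    val cls = Class.forName(className)
    val method = cls.getMethod(methodName, args.map(_._1): _*)
    // Only static methods can be invoked with a null receiver.
    require(Modifier.isStatic(method.getModifiers), s"$methodName is not static")
    String.valueOf(method.invoke(null, args.map(_._2): _*))
  }

  def main(args: Array[String]): Unit = {
    // Mirrors the usage example in the ExpressionDescription: a random UUID string.
    println(callStatic("java.util.UUID", "randomUUID"))
    // Primitive parameters are matched by classOf[Int]; arguments are passed boxed.
    println(callStatic("java.lang.Math", "max",
      (classOf[Int], Int.box(3)), (classOf[Int], Int.box(7))))
  }
}
```

Passing `null` as the receiver is what restricts this approach to static methods, which matches the limitation stated in the class comment.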
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14065 Merged build finished. Test FAILed.
[GitHub] spark issue #14065: [SPARK-14743][YARN][WIP] Add a configurable token manage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14065 **[Test build #62148 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62148/consoleFull)** for PR 14065 at commit [`0fbf25b`](https://github.com/apache/spark/commit/0fbf25b51285ea433a3dcf733b92a1d785b3d017).
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14090 **[Test build #62147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62147/consoleFull)** for PR 14090 at commit [`2af7243`](https://github.com/apache/spark/commit/2af724321e0d51aed64c84dd22741a7cc6067caf).
[GitHub] spark pull request #13990: [SPARK-16287][SQL] Implement str_to_map SQL funct...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13990#discussion_r70382521 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -393,3 +394,89 @@ case class CreateNamedStructUnsafe(children: Seq[Expression]) extends Expression override def prettyName: String = "named_struct_unsafe" } + +/** + * Creates a map after splitting the input text into key/value pairs using delimeters + */ +@ExpressionDescription( + usage = "_FUNC_(text[, pairDelim, keyValueDelim]) - Creates a map after splitting the text " + +"into key/value pairs using delimiters. " + +"Default delimiters are ',' for pairDelim and ':' for keyValueDelim.", + extended = """ > SELECT _FUNC_('a:1,b:2,c:3',',',':');\n map("a":"1","b":"2","c":"3") """) +case class StringToMap(text: Expression, pairDelim: Expression, keyValueDelim: Expression) + extends TernaryExpression with ExpectsInputTypes { + + def this(child: Expression, pairDelim: Expression) = { +this(child, pairDelim, Literal(":")) + } + + def this(child: Expression) = { +this(child, Literal(","), Literal(":")) + } + + override def children: Seq[Expression] = Seq(text, pairDelim, keyValueDelim) + + override def inputTypes: Seq[AbstractDataType] = Seq(StringType, StringType, StringType) + + override def dataType: DataType = MapType(StringType, StringType, valueContainsNull = false) + + override def eval(input: InternalRow): Any = { +val exprs = children +val value1 = exprs(0).eval(input) +if (value1 != null) { + val value2 = exprs(1).eval(input) + if (value2 != null) { +val value3 = exprs(2).eval(input) +if (value3 != null) { + val array = value1.asInstanceOf[UTF8String] +.split(value2.asInstanceOf[UTF8String], -1) +.map { kv => + val arr = kv.split(value3.asInstanceOf[UTF8String], 2) + if(arr.length < 2) { +Array(arr(0), null) + } else { +arr + } +} + return ArrayBasedMapData(array.map(_(0)), array.map(_(1))) +} + } +} +throw new 
AnalysisException("All arguments should be a string literal.") --- End diff -- This should be done in `checkInputDataTypes`, not here inside `eval` at runtime
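The splitting logic in the diff above (split the text on the pair delimiter, then split each pair on the key/value delimiter with a limit of 2, leaving the value empty when the delimiter is missing) can be sketched in plain Scala. This is only an illustration under stated assumptions: it uses `String` instead of `UTF8String`, treats the delimiters as literal text via `Pattern.quote`, and models a missing value as `None` rather than `null`. The name `strToMap` is hypothetical.

```scala
import java.util.regex.Pattern

object StrToMapSketch {
  // Hypothetical plain-Scala stand-in for the str_to_map splitting step.
  def strToMap(
      text: String,
      pairDelim: String = ",",
      keyValueDelim: String = ":"): Map[String, Option[String]] = {
    text
      .split(Pattern.quote(pairDelim), -1) // -1 keeps trailing empty pairs
      .map { kv =>
        // Limit 2: only the first key/value delimiter splits the pair.
        kv.split(Pattern.quote(keyValueDelim), 2) match {
          case Array(k, v) => k -> Some(v)
          case Array(k)    => k -> None // no delimiter found: key without a value
        }
      }
      .toMap
  }

  def main(args: Array[String]): Unit = {
    println(strToMap("a:1,b:2,c:3"))
    println(strToMap("a:1,b"))
  }
}
```

The sketch does no argument validation at all; as the review comment says, checks on the arguments belong in the expression's type-checking step rather than in per-row evaluation.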
[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14116 **[Test build #62146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62146/consoleFull)** for PR 14116 at commit [`a8dc9f4`](https://github.com/apache/spark/commit/a8dc9f41427e20e83f915ec845cbde728bb4a8d8).
[GitHub] spark pull request #14036: [SPARK-16323] [SQL] Add IntegerDivide to avoid un...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14036#discussion_r70381897 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala --- @@ -237,6 +229,9 @@ case class Divide(left: Expression, right: Expression) } } + // Used by doGenCode + protected def divide(eval1: ExprCode, eval2: ExprCode, javaType: String): String --- End diff -- I don't think we need this abstraction. [This one](https://github.com/apache/spark/pull/14036/files#diff-1516b10738479bbe190fb4e239258473L252) already covers both the fractional and integral cases.
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user NarineK commented on the issue: https://github.com/apache/spark/pull/14090 Added data type description
[GitHub] spark issue #14090: [SPARK-16112][SparkR] Programming guide for gapply/gappl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14090 **[Test build #62145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62145/consoleFull)** for PR 14090 at commit [`c1d7151`](https://github.com/apache/spark/commit/c1d71512a3bf0205615d1b6318029ad6f33d94dc).
[GitHub] spark pull request #13704: [SPARK-15985][SQL] Reduce runtime overhead of a p...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13704#discussion_r70381491 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCastsSuite.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.dsl.expressions._ +import org.apache.spark.sql.catalyst.dsl.plans._ +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.plans.PlanTest +import org.apache.spark.sql.catalyst.plans.logical._ +import org.apache.spark.sql.catalyst.rules.RuleExecutor +import org.apache.spark.sql.types._ + +class SimplifyCastsSuite extends PlanTest { + + object Optimize extends RuleExecutor[LogicalPlan] { +val batches = Batch("SimplifyCasts", FixedPoint(50), SimplifyCasts) :: Nil + } + + test("non-nullable to non-nullable array cast") { +val input = LocalRelation('a.array(ArrayType(IntegerType, false))) +val array_intPrimitive = Literal.create( + Seq(1, 2, 3, 4, 5), ArrayType(IntegerType, false)) +val plan = input.select(array_intPrimitive --- End diff -- ah, i see. 
in `'a.array(dt)`, `dt` is the element type, so you are creating an array of arrays. However, the `array` method doesn't take `nullable`; maybe we should fix that.
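The array-of-arrays point can be shown with a toy model. Everything below is a simplified stand-in written for this note, not Spark's actual classes: `DataType`, `ArrayType` and `array` are hypothetical miniatures, and the assumption that `array(dt)` wraps `dt` with `containsNull = true` is part of the sketch.

```scala
object ArrayDslSketch {
  // Toy type model, loosely shaped like the catalyst types named in the test.
  sealed trait DataType
  case object IntegerType extends DataType
  final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

  // Hypothetical stand-in for the DSL helper: the argument is the ELEMENT type,
  // and the caller has no way to pass nullability.
  def array(elementType: DataType): ArrayType =
    ArrayType(elementType, containsNull = true)

  // What the test author meant: a non-nullable array of ints.
  val intended: DataType = ArrayType(IntegerType, containsNull = false)

  // What passing an ArrayType to array(...) builds: an array of arrays.
  val actual: DataType = array(ArrayType(IntegerType, containsNull = false))

  def main(args: Array[String]): Unit = {
    println(intended) // ArrayType(IntegerType,false)
    println(actual)   // ArrayType(ArrayType(IntegerType,false),true)
  }
}
```

In this model the intended type can only be built by constructing `ArrayType` directly, which is the gap the comment suggests closing.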