[GitHub] spark issue #15723: [SPARK-18214][SQL] Simplify RuntimeReplaceable type coer...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15723 **[Test build #67935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67935/consoleFull)** for PR 15723 at commit [`8cdf56c`](https://github.com/apache/spark/commit/8cdf56c1191a7fa3a4a567b92c11e1617d0f4f0e). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15723: [SPARK-18214][SQL] Simplify RuntimeReplaceable ty...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/15723

[SPARK-18214][SQL] Simplify RuntimeReplaceable type coercion

## What changes were proposed in this pull request?

RuntimeReplaceable is used to create aliases for expressions, but the way it deals with type coercion is pretty weird: each expression is responsible for handling its own type coercion, which does not obey the normal implicit type cast rules. This patch simplifies the handling by allowing the analyzer to traverse into the actual expression of a RuntimeReplaceable.

## How was this patch tested?

- Correctness should already be guaranteed by existing unit tests
- Removed SQLCompatibilityFunctionSuite and moved it to sql-compatibility-functions.sql
- Added a new test case in sql-compatibility-functions.sql for verifying explain behavior.

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/rxin/spark SPARK-18214
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/15723.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #15723

commit 8cdf56c1191a7fa3a4a567b92c11e1617d0f4f0e
Author: Reynold Xin
Date: 2016-11-02T00:33:35Z
    [SPARK-18214][SQL] Simplify RuntimeReplaceable type coercion
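The idea in the description above can be illustrated with a toy expression tree. This is only a sketch: `Expr`, `Lit`, `Cast`, `Coalesce`, `ReplaceableAlias` and `Coercion` are hypothetical names invented for illustration, not Spark's Catalyst classes, and the single int-to-double widening rule stands in for the real implicit cast rules. The point it tries to show is that the alias carries no coercion logic of its own; one generic rule walks into the expression it stands for.

```java
/** Toy expression tree; all class names here are hypothetical, not Spark's Catalyst API. */
abstract class Expr {
    abstract String type();
    abstract String render();
}

final class Lit extends Expr {
    private final String text;
    private final String type;
    Lit(String text, String type) { this.text = text; this.type = type; }
    String type() { return type; }
    String render() { return text; }
}

final class Cast extends Expr {
    private final Expr child;
    private final String to;
    Cast(Expr child, String to) { this.child = child; this.to = to; }
    String type() { return to; }
    String render() { return "cast(" + child.render() + " as " + to + ")"; }
}

final class Coalesce extends Expr {
    final Expr left;
    final Expr right;
    Coalesce(Expr left, Expr right) { this.left = left; this.right = right; }
    String type() { return left.type(); }
    String render() { return "coalesce(" + left.render() + ", " + right.render() + ")"; }
}

/** An alias (think nvl) that only exists to be replaced by another expression. */
final class ReplaceableAlias extends Expr {
    final String name;
    final Expr replacement;
    ReplaceableAlias(String name, Expr replacement) { this.name = name; this.replacement = replacement; }
    String type() { return replacement.type(); }
    String render() { return replacement.render(); }
}

final class Coercion {
    /** One generic rule: widen int to double when the two sides of a coalesce disagree. */
    static Expr coerce(Expr e) {
        if (e instanceof ReplaceableAlias) {
            ReplaceableAlias a = (ReplaceableAlias) e;
            // The alias defines no coercion of its own; the rule traverses into its replacement.
            return new ReplaceableAlias(a.name, coerce(a.replacement));
        }
        if (e instanceof Coalesce) {
            Coalesce c = (Coalesce) e;
            Expr l = coerce(c.left);
            Expr r = coerce(c.right);
            if (l.type().equals("int") && r.type().equals("double")) l = new Cast(l, "double");
            if (l.type().equals("double") && r.type().equals("int")) r = new Cast(r, "double");
            return new Coalesce(l, r);
        }
        return e; // literals and casts need no change in this toy model
    }
}

final class CoercionDemo {
    public static void main(String[] args) {
        Expr nvl = new ReplaceableAlias("nvl",
            new Coalesce(new Lit("1", "int"), new Lit("2.0", "double")));
        System.out.println(Coercion.coerce(nvl).render());
    }
}
```

In this model, adding another alias means adding only a new replacement expression; the coercion rule itself stays untouched.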
[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...
Github user pramitchoudhary commented on the issue: https://github.com/apache/spark/pull/9183 @yanboliang This is a very helpful initiative. Thanks for taking it up. Let me know if you need any help with this PR.
[GitHub] spark issue #15618: [WIP][SPARK-14914][CORE] Fix Resource not closed after u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15618 Merged build finished. Test FAILed.
[GitHub] spark pull request #15702: [SPARK-18124] Observed delay based Event Time Wat...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15702#discussion_r86053925
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/EventTimeWatermark.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.plans.logical
+
+import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression}
+import org.apache.spark.sql.types.MetadataBuilder
+import org.apache.spark.unsafe.types.CalendarInterval
+
+object EventTimeWatermark {
+  /** The [[org.apache.spark.sql.types.Metadata]] key used to hold the eventTime watermark delay. */
+  val delayKey = "spark.watermarkDelay"
+}
+
+/**
+ * Used to mark a user specified column as holding the event time for a row.
+ */
+case class EventTimeWatermark(
+    eventTime: Attribute,
+    delay: CalendarInterval,
+    child: LogicalPlan) extends LogicalPlan {
+
+  // Update the metadata on the eventTime column to include the desired delay.
+  override val output: Seq[Attribute] = child.output.map { a =>
+    if (a semanticEquals eventTime) {
+      val updatedMetadata = new MetadataBuilder()
+        .withMetadata(a.metadata)
+        .putLong(EventTimeWatermark.delayKey, delay.milliseconds)
--- End diff --
Updating the key to include `Ms`
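The metadata update in the quoted diff can be sketched outside Spark as a plain copy-and-tag pass over the output columns. This is an illustrative Java model, not the Catalyst code: `Column` and `WatermarkSketch` are hypothetical names, matching by column name stands in for `semanticEquals`, and the key string is the one shown in the diff (the review comment says it will be renamed to include `Ms`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an output attribute: a name plus a metadata map. */
final class Column {
    final String name;
    final Map<String, Long> metadata;
    Column(String name, Map<String, Long> metadata) {
        this.name = name;
        this.metadata = metadata;
    }
}

final class WatermarkSketch {
    // Key as it appears in the quoted diff.
    static final String DELAY_KEY = "spark.watermarkDelay";

    /**
     * Returns a new column list in which only the event-time column carries the delay.
     * Name equality is a simplification of the semanticEquals check in the diff.
     */
    static List<Column> withDelay(List<Column> output, String eventTime, long delayMs) {
        List<Column> result = new ArrayList<>();
        for (Column c : output) {
            if (c.name.equals(eventTime)) {
                // Copy the existing metadata, then add the delay; the input is left untouched.
                Map<String, Long> updated = new HashMap<>(c.metadata);
                updated.put(DELAY_KEY, delayMs);
                result.add(new Column(c.name, updated));
            } else {
                result.add(c);
            }
        }
        return result;
    }
}

final class WatermarkDemo {
    public static void main(String[] args) {
        List<Column> output = Arrays.asList(
            new Column("eventTime", new HashMap<>()),
            new Column("value", new HashMap<>()));
        for (Column c : WatermarkSketch.withDelay(output, "eventTime", 10_000L)) {
            System.out.println(c.name + " " + c.metadata);
        }
    }
}
```

Copying the metadata before adding the key mirrors the `withMetadata(a.metadata)` call in the diff, which keeps whatever the column already carried.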
[GitHub] spark issue #15618: [WIP][SPARK-14914][CORE] Fix Resource not closed after u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15618 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67928/ Test FAILed.
[GitHub] spark issue #15618: [WIP][SPARK-14914][CORE] Fix Resource not closed after u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15618 **[Test build #67928 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67928/consoleFull)** for PR 15618 at commit [`1521572`](https://github.com/apache/spark/commit/15215722dfe2de0785d38cce5713f33fac5e4b03). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15700: [SPARK-17964][SparkR] Enable SparkR with Mesos cl...
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/15700#discussion_r86052116
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -84,7 +84,7 @@ private[spark] object RUtils {
       }
     } else {
       // Otherwise, assume the package is local
-      // TODO: support this for Mesos
+      // For Mesos, the path is also under SPARK_HOME.
--- End diff --
I have removed this comment.
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 Here is my understanding: after spilling, the map calls reset() to release memory. reset() frees all the memory pages, but it does not release any of the memory held by longArray, so longArray keeps growing. By freeing the array and allocating only the initial size for it, we can ensure that longArray does not grow indefinitely.
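The difference between the two reset behaviours discussed in this comment can be sketched with a toy growable array. This is not `BytesToBytesMap`: `GrowableIndex`, its initial capacity of 4, and both method names are assumptions made up for illustration. The sketch only shows that clearing the contents keeps the grown capacity, whereas reallocating at the initial size gives it back.

```java
import java.util.Arrays;

/** Toy stand-in for a map's pointer array; NOT Spark's BytesToBytesMap. */
final class GrowableIndex {
    static final int INITIAL_CAPACITY = 4; // hypothetical initial size

    private long[] slots = new long[INITIAL_CAPACITY];
    private int size = 0;

    /** Appends a value, doubling the backing array when it is full. */
    void add(long value) {
        if (size == slots.length) {
            slots = Arrays.copyOf(slots, slots.length * 2);
        }
        slots[size++] = value;
    }

    /** Clears the contents but keeps the grown array (the behaviour described before the patch). */
    void resetKeepingArray() {
        Arrays.fill(slots, 0L);
        size = 0;
    }

    /** Drops the grown array and starts again at the initial size (the behaviour the patch proposes). */
    void resetAndShrink() {
        slots = new long[INITIAL_CAPACITY];
        size = 0;
    }

    int capacity() { return slots.length; }
    int size() { return size; }
}

final class ResetDemo {
    public static void main(String[] args) {
        GrowableIndex keep = new GrowableIndex();
        GrowableIndex shrink = new GrowableIndex();
        for (int i = 0; i < 100; i++) {
            keep.add(i);
            shrink.add(i);
        }
        keep.resetKeepingArray();
        shrink.resetAndShrink();
        System.out.println("kept capacity:   " + keep.capacity());
        System.out.println("shrunk capacity: " + shrink.capacity());
    }
}
```

The trade-off raised elsewhere in the thread is visible here too: keeping the array avoids re-growing it after every spill, at the cost of holding on to memory the map may not need again.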
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13617 **[Test build #67934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67934/consoleFull)** for PR 13617 at commit [`be4c5ea`](https://github.com/apache/spark/commit/be4c5eab315ce3d567aed9c947531518b2b2e921).
[GitHub] spark pull request #15647: [SPARK-18088][ML] Various ChiSqSelector cleanups
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15647
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15703 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67930/ Test FAILed.
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15703 Merged build finished. Test FAILed.
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15703 **[Test build #67930 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67930/consoleFull)** for PR 15703 at commit [`6f28688`](https://github.com/apache/spark/commit/6f28688fd5b3ed1fd02a1082c121812cecbfe001). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15647: [SPARK-18088][ML] Various ChiSqSelector cleanups
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15647 I'll go ahead and merge this, but please comment if it needs any follow-ups. Merging with master. Thanks for the review!
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong Do you have a theory for why this would cause an OOM? As I see it, the current code uses more memory than needed but does fewer allocations, so why would it cause an OOM?
[GitHub] spark pull request #15698: [SPARK-18182] Expose ReplayListenerBus.read() ove...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15698
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15541 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67924/ Test PASSed.
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15541 Merged build finished. Test PASSed.
[GitHub] spark issue #15698: [SPARK-18182] Expose ReplayListenerBus.read() overload w...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15698 Merging in master.
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15541 **[Test build #67924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67924/consoleFull)** for PR 15541 at commit [`b06de5e`](https://github.com/apache/spark/commit/b06de5ed0c285f01230f7182bf53676a9f6de74e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 This is a production query. Sorry, I cannot share it. It does a join between two big tables.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13617 **[Test build #67933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67933/consoleFull)** for PR 13617 at commit [`f3a1193`](https://github.com/apache/spark/commit/f3a11932e152b1b8b3b7f8ce23a23c8decc3ba1d).
[GitHub] spark issue #15715: [SPARK-18198][Doc][Streaming] Highlight code snippets
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15715 +1
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15703 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67927/ Test FAILed.
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15703 **[Test build #67927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67927/consoleFull)** for PR 15703 at commit [`3af4f21`](https://github.com/apache/spark/commit/3af4f211ceaf749568fd9fb819c25c5d64892dac). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15703 Merged build finished. Test FAILed.
[GitHub] spark pull request #15722: [SPARK-18208] [Shuffle] Executor OOM due to a mem...
Github user jiexiong commented on a diff in the pull request: https://github.com/apache/spark/pull/15722#discussion_r86048313
--- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -903,11 +906,12 @@ public void reset() {
     numKeys = 0;
     numValues = 0;
     longArray.zeroOut();
-
+    freeArray(longArray);
--- End diff --
Yeah, I will remove it.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13617 Merged build finished. Test FAILed.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13617 **[Test build #67932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67932/consoleFull)** for PR 13617 at commit [`322f3bd`](https://github.com/apache/spark/commit/322f3bd86b0201a26359b307b1951b259898e8a4). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13617 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67932/ Test FAILed.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13617 **[Test build #67932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67932/consoleFull)** for PR 13617 at commit [`322f3bd`](https://github.com/apache/spark/commit/322f3bd86b0201a26359b307b1951b259898e8a4).
[GitHub] spark pull request #14906: [SPARK-17350][SQL] Disable default use of KryoSer...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14906
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 Here is the query:
INSERT OVERWRITE TABLE lookalike_trainer_campaign_conv_users_with_country_shadow PARTITION(ds='2016-10-19')
SELECT c.source_id, c.country, c.user_id, c.conversion_time
FROM (
  SELECT b.source_id, b.country, b.user_id, b.conversion_time,
         FB_NUMBER_ROWS(b.country, b.source_id) AS rank
  FROM (
    SELECT source_id, country, user_id, MAX(conversion_time) / 1000 AS conversion_time
    FROM (
      SELECT v.campaigngroup_id, v.campaign_id, v.adgroup_id, v.user_id, Y.country,
             v.last_conversion_time AS conversion_time
      FROM dim_all_users_fast:bi Y
      JOIN lookalike_trainer_campaign_conv_raw v ON v.user_id = Y.userid
      WHERE v.ds='2016-10-19' AND Y.ds = '2016-10-19' AND Y.country IS NOT NULL
    ) a
    LATERAL VIEW EXPLODE(ARRAY(campaigngroup_id, campaign_id, adgroup_id)) s AS source_id
    GROUP BY country, source_id, user_id
    DISTRIBUTE BY country, source_id
    SORT BY country, source_id, conversion_time DESC
  ) b
) c
WHERE rank <= 6
Before the fix, the query failed with an OOM error. After the fix, the OOM error went away.
[GitHub] spark pull request #15722: [SPARK-18208] [Shuffle] Executor OOM due to a mem...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/15722#discussion_r86046017
--- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -903,11 +906,12 @@ public void reset() {
     numKeys = 0;
     numValues = 0;
     longArray.zeroOut();
-
+    freeArray(longArray);
--- End diff --
You could also remove the zeroOut(), since it is no longer relevant.
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 OK to test.
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong I don't think this is a memory leak. BytesToBytesMap does not release all of its memory on each spill, on the assumption that the memory will be re-acquired soon. What query makes you think this is a leak?
[GitHub] spark issue #14906: [SPARK-17350][SQL] Disable default use of KryoSerializer...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14906 Merging in master.
[GitHub] spark issue #15027: [SPARK-17475] [STREAMING] Delete CRC files if the filesy...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15027 **[Test build #67931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67931/consoleFull)** for PR 15027 at commit [`9ff89c0`](https://github.com/apache/spark/commit/9ff89c0228c09764fa6444528050a35e823db0e6).
[GitHub] spark issue #15027: [SPARK-17475] [STREAMING] Delete CRC files if the filesy...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15027 It turns out there is a bug in Hadoop's FileContext: it doesn't rename the checksum file. LGTM pending tests.
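As a rough illustration of the problem described above, the sketch below uses plain `java.nio` rather than any Hadoop API. It assumes the sidecar naming convention `.<name>.crc` used by Hadoop's checksummed local file system; the class `CrcCleanup` and its methods are hypothetical names made up for this note and are not the code in the pull request. The idea shown is simply: if a rename moves the data file but leaves its checksum sidecar behind, delete the orphan afterwards.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not part of Spark or Hadoop. */
class CrcCleanup {
    /** Sidecar path for a data file, assuming the ".<name>.crc" convention. */
    static Path crcFor(Path file) {
        return file.resolveSibling("." + file.getFileName() + ".crc");
    }

    /**
     * Renames src to dst, then removes the checksum file left next to src.
     * Returns true if an orphaned checksum file was found and deleted.
     */
    static boolean renameAndCleanCrc(Path src, Path dst) {
        try {
            Files.move(src, dst);
            return Files.deleteIfExists(crcFor(src));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a temp file plus a fake sidecar, renames it, and reports the cleanup. */
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("crc-demo");
            Path tmp = Files.write(dir.resolve("temp"), new byte[] {1, 2, 3});
            Files.write(crcFor(tmp), new byte[] {0});
            boolean removed = renameAndCleanCrc(tmp, dir.resolve("final"));
            return removed && Files.exists(dir.resolve("final")) && !Files.exists(crcFor(tmp));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("orphaned checksum file removed: " + demo());
    }
}
```

Deleting after the rename, rather than before, keeps the data file and its checksum consistent for as long as the source file still exists.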
[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a memory lea...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Can one of the admins verify this patch?
[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67921/ Test PASSed.
[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15704 Merged build finished. Test PASSed.
[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15704 **[Test build #67921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67921/consoleFull)** for PR 15704 at commit [`72084a0`](https://github.com/apache/spark/commit/72084a00e4d6da2b2d0ef97cbef22f23f93cbb46).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #15722: [SPARK-18208] [Shuffle] Executor OOM due to a mem...
GitHub user jiexiong opened a pull request: https://github.com/apache/spark/pull/15722

[SPARK-18208] [Shuffle] Executor OOM due to a memory leak in BytesToBytesMap

## What changes were proposed in this pull request?

Fixed the OOM problem in BytesToBytesMap.

## How was this patch tested?

Built the package and submitted a query to spark-shell with the new package.

    ./bin/fb-spark-shell.sh -q ad_metrics -t -- --conf spark.dynamicAllocation.maxExecutors=500 --conf spark.sql.shuffle.partitions=4000

History server UI location is https://fburl.com/484871973
Scuba dataset link for task metrics is https://fburl.com/484871984
Scuba dataset link for resource metrics is https://fburl.com/484871989
Scuba data link for profiling is https://fburl.com/484871993
Resource Manager Session link is https://our.intern.facebook.com/intern/bumblebee_proxy/?proxy_url=http://hadoop2016.atn1.facebook.com:8810/resourceManager/session.html?sessionId=12650700179
Spark context Web UI available at http://[2401:db00:11:d0af:face:0:19:0]:8087

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiexiong/spark jie_oom_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15722.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15722

commit 8fa8fa4bca355ac8f803132156fd3926d68190f8 Author: Jie Xiong Date: 2016-11-01T20:43:19Z Fix the OOM failure from this operator.
commit 80fd5b394b0c9b7673c8a8a4eed65bc54afb7205 Author: Jie Xiong Date: 2016-11-01T22:48:59Z Merge remote-tracking branch 'upstream/master' into jie_oom_fix
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15703 **[Test build #67930 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67930/consoleFull)** for PR 15703 at commit [`6f28688`](https://github.com/apache/spark/commit/6f28688fd5b3ed1fd02a1082c121812cecbfe001).
[GitHub] spark issue #15718: [SPARK-16839][SQL] Simplify Struct creation code path
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15718 Build finished. Test PASSed.
[GitHub] spark issue #15718: [SPARK-16839][SQL] Simplify Struct creation code path
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15718 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67918/ Test PASSed.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13617 Merged build finished. Test FAILed.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13617 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67929/ Test FAILed.
[GitHub] spark issue #13617: [SPARK-10409] [ML] Add Multilayer Perceptron Regression ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13617 **[Test build #67929 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67929/consoleFull)** for PR 13617 at commit [`a5d9972`](https://github.com/apache/spark/commit/a5d9972da6b8002109d1fa611647fb39b3596bec).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15718: [SPARK-16839][SQL] Simplify Struct creation code path
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15718 **[Test build #67918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67918/consoleFull)** for PR 15718 at commit [`29ccf4e`](https://github.com/apache/spark/commit/29ccf4edaea3e6ba640bf688ecad976818639696).
* This patch passes all tests.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67919/ Test PASSed.
[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15704 Merged build finished. Test PASSed.
[GitHub] spark issue #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15704 **[Test build #67919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67919/consoleFull)** for PR 15704 at commit [`05c83fa`](https://github.com/apache/spark/commit/05c83fa3305fb0193041dbbcb97113a8c5bbfcb2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15697: [SparkR][Test]:remove unnecessary suppressWarnings
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15697 Ah, yes, that is unfortunately a problem now. It seems even committers can't retrigger. That's why I ran this via another account and left the comment above. I am trying to find a better way and will probably open an INFRA JIRA.
[GitHub] spark issue #15697: [SparkR][Test]:remove unnecessary suppressWarnings
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15697 No, I just launched the Jenkins test. I don't know how to manually trigger the AppVeyor test.
[GitHub] spark issue #15673: [SPARK-17992][SQL] Return all partitions from HiveShim w...
Github user mallman commented on the issue: https://github.com/apache/spark/pull/15673 @ericl I can do that, yes. I'm currently tied down. I will push a new commit later today or tonight.
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15703 **[Test build #67927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67927/consoleFull)** for PR 15703 at commit [`3af4f21`](https://github.com/apache/spark/commit/3af4f211ceaf749568fd9fb819c25c5d64892dac).
[GitHub] spark issue #15618: [WIP][SPARK-14914][CORE] Fix Resource not closed after u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15618 **[Test build #67928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67928/consoleFull)** for PR 15618 at commit [`1521572`](https://github.com/apache/spark/commit/15215722dfe2de0785d38cce5713f33fac5e4b03).
[GitHub] spark issue #15618: [SPARK-14914][CORE] Fix Resource not closed after using,...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15618 Oh wait, it seems the failed test is related. Will take a look.
[GitHub] spark issue #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedImperativ...
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15703 It turned out that I didn't initialize Hive UDAF evaluators properly. Quoted from the commit message of my previous commit:

> Hive UDAFs are sensitive to aggregation mode, and must be initialized with proper modes before being used. Basically, it means that you can't use an evaluator initialized with mode `PARTIAL1` to merge two aggregation states (although it still works for aggregate functions whose partial result type is the same as the final result type).
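The mode sensitivity quoted above can be sketched without any Hive dependency. The four mode names below mirror Hive's `GenericUDAFEvaluator.Mode` (`PARTIAL1`: raw input to partial state, `PARTIAL2`: partial to partial, `FINAL`: partial to final result, `COMPLETE`: raw input to final result). Everything else, including the class `ToyAvgEvaluator` and its array-based partial state, is a hypothetical stand-in. The toy makes the mismatch explicit by throwing; a real evaluator initialized with the wrong mode may fail in other ways.

```java
/** Toy "average" aggregate whose partial state (sum, count) differs from its final result. */
class ToyAvgEvaluator {
    enum Mode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

    private final Mode mode;
    private double sum;
    private long count;

    ToyAvgEvaluator(Mode mode) {
        this.mode = mode;
    }

    /** Consumes one raw input value; only valid when the mode reads raw data. */
    void iterate(double value) {
        if (mode != Mode.PARTIAL1 && mode != Mode.COMPLETE) {
            throw new IllegalStateException("iterate() not valid in mode " + mode);
        }
        sum += value;
        count++;
    }

    /** Consumes a partial state {sum, count}; only valid when the mode reads partial states. */
    void merge(double[] partial) {
        if (mode != Mode.PARTIAL2 && mode != Mode.FINAL) {
            throw new IllegalStateException("merge() not valid in mode " + mode);
        }
        sum += partial[0];
        count += (long) partial[1];
    }

    /** Emits the partial state as {sum, count}. */
    double[] terminatePartial() {
        return new double[] {sum, count};
    }

    /** Emits the final result. */
    double terminate() {
        return count == 0 ? Double.NaN : sum / count;
    }

    /** Two map-side evaluators feed one reduce-side evaluator. */
    static double demo() {
        ToyAvgEvaluator left = new ToyAvgEvaluator(Mode.PARTIAL1);
        left.iterate(1.0);
        left.iterate(2.0);
        ToyAvgEvaluator right = new ToyAvgEvaluator(Mode.PARTIAL1);
        right.iterate(3.0);

        ToyAvgEvaluator reducer = new ToyAvgEvaluator(Mode.FINAL);
        reducer.merge(left.terminatePartial());
        reducer.merge(right.terminatePartial());
        return reducer.terminate();
    }

    public static void main(String[] args) {
        System.out.println("average = " + demo());
    }
}
```

For an aggregate such as a sum, where the partial state and the final result have the same type, the distinction between merging and iterating is much less visible, which is consistent with the caveat in the quoted commit message.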
[GitHub] spark pull request #15618: [SPARK-14914][CORE] Fix Resource not closed after...
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/15618
[GitHub] spark pull request #15618: [SPARK-14914][CORE] Fix Resource not closed after...
GitHub user HyukjinKwon reopened a pull request: https://github.com/apache/spark/pull/15618

[SPARK-14914][CORE] Fix Resource not closed after using, mostly for unit tests

## What changes were proposed in this pull request?

Close `FileStreams`, `ZipFiles` etc to release the resources after using. Not closing the resources will cause IO Exception to be raised while deleting temp files.

## How was this patch tested?

Existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-14914-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15618.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15618

commit ceb1681620b83e46425d20f20ff5a228c0a8d75b Author: U-FAREAST\tl Date: 2016-04-25T03:37:06Z mvn.cmd for windows Helping script for windows to download dependency and start zinc to support incremental building on windows.
commit 98ee4ab6a96b3d8ad8acd8dbeb6dedc10523546e Author: U-FAREAST\tl Date: 2016-04-25T06:50:59Z Fix file not closed on FileSuite
commit ed52ec72fee774e9b8a1bec3183ae4594d758b9f Author: U-FAREAST\tl Date: 2016-04-25T07:45:39Z Fix another file closing
commit 448508f467c0401f5461d79dadc1454b03210368 Author: Tao LI Date: 2016-04-25T07:57:42Z close log data.
commit 34078a3facde46f09430baf1a1a5976b1c2d2869 Author: Tao LI Date: 2016-04-25T08:01:55Z Close the class loader
commit efb7227518e9fdb8c1a35ae2adb3971c9cfc1ac2 Author: Tao LI Date: 2016-04-25T08:20:48Z Another file not closed.
commit a06bffc02e4d2cd7e723e73102c160c1c57f0915 Author: Tao LI Date: 2016-04-25T08:29:16Z Stop to release resources.
commit 45262dcc58073a99417b9d1a6c0e24c393716c8f Author: U-FAREAST\tl Date: 2016-04-25T08:34:45Z More closing problem
commit b3c0c96fb4cbe19344cc220e234c98644aa0efcd Author: U-FAREAST\tl Date: 2016-04-25T13:25:42Z Fix the zip file and jar file in RPackageUtilsSuite
commit a176adbadfa68cb0819a1e958129c4d96b42b42c Author: U-FAREAST\tl Date: 2016-04-26T05:35:35Z Stop ssc in MasterFailureTest
commit 35aacd29c4667550a4f870ee521ed185c5f9800c Author: U-FAREAST\tl Date: 2016-04-26T08:09:04Z Remove accidentally added files
commit 9f50128da0660fed97d64b8a5e0d63285dbf93d5 Author: U-FAREAST\tl Date: 2016-05-03T06:11:12Z Code cleanup with respect to comments
commit 55b360e276968eecea970267b0fa438b56e5e703 Author: U-FAREAST\tl Date: 2016-05-05T03:51:50Z Style fixes
commit 91f82b5fac48afefeceadd764ff0e7b61944d875 Author: U-FAREAST\tl Date: 2016-05-05T07:29:32Z Minor code cleanup
commit f3713d1ff59e5bd45a8a207eaf36ab8e6c285812 Author: hyukjinkwon Date: 2016-10-25T05:44:15Z ex -> e and indentation
commit 863ea7f66d4919a3e2c8ee6e3ca575a80a3115dc Author: hyukjinkwon Date: 2016-10-27T15:55:39Z Use Utils.tryWithSafeFinally where possible.
commit 3949dbeda632e215c64dcc089365bdc7334dacaf Author: hyukjinkwon Date: 2016-10-29T02:02:43Z close loader later
commit 49cb4e7f259ba0a236b0a977e69719cfc165c265 Author: hyukjinkwon Date: 2016-10-30T12:07:32Z Initialize receivedBlockTracker in start()
commit 15215722dfe2de0785d38cce5713f33fac5e4b03 Author: hyukjinkwon Date: 2016-11-01T15:43:39Z Require nulls for eventpoint and the tracker in start and remove other checks in methods
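The pattern this pull request describes, closing a stream before deleting the temp file it wrote, can be sketched in plain Java. The class `CloseBeforeDelete` and its method are hypothetical and written only for this note; the pull request itself works on Spark's own test code, where one of the listed commits uses `Utils.tryWithSafeFinally`. Java's try-with-resources is used here as a rough analogue: the stream is closed even when the body throws. On platforms such as Windows that refuse to delete a file with an open handle, skipping the close is what makes the later delete fail.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical example; not code from the pull request. */
class CloseBeforeDelete {
    /** Writes a temp file, closes the stream, deletes the file, and reports whether it is gone. */
    static boolean writeThenDelete() {
        try {
            Path tmp = Files.createTempFile("spark-demo", ".bin");
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write(new byte[] {1, 2, 3});
            } // out.close() has already run here, even if write() had thrown
            Files.delete(tmp);
            return !Files.exists(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("temp file deleted: " + writeThenDelete());
    }
}
```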
[GitHub] spark issue #15618: [SPARK-14914][CORE] Fix Resource not closed after using,...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15618 Ll
[GitHub] spark issue #15618: [SPARK-14914][CORE] Fix Resource not closed after using,...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15618 retest this please
[GitHub] spark issue #15697: [SparkR][Test]:remove unnecessary suppressWarnings
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15697 Oh, the AppVeyor test is not triggered again along with the Jenkins one. Or do you mean you launched another build via another account and tested it manually?
[GitHub] spark issue #15553: [SPARK-18008] [build] Add support for -Dmaven.test.skip=...
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/15553 Closing the PR given the pushback against committing it.
[GitHub] spark pull request #15553: [SPARK-18008] [build] Add support for -Dmaven.tes...
Github user mridulm closed the pull request at: https://github.com/apache/spark/pull/15553
[GitHub] spark issue #15698: [SPARK-18182] Expose ReplayListenerBus.read() overload w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15698 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67917/ Test PASSed.
[GitHub] spark issue #15698: [SPARK-18182] Expose ReplayListenerBus.read() overload w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15698 Merged build finished. Test PASSed.
[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15659 I think we should try to get this in so that it can be part of the 2.1 release, so we can have publishing to PyPI added in 2.1.1 or 2.2. We've been looking at making PySpark installable with pip since April of 2014 in one form or another, and from the discussions I think it's pretty clear this could make a big difference in Spark adoption in the Python community. The most obvious non-additive change is changing how the scripts resolve SPARK_HOME, but I think if we're extra careful around that it would be OK to merge to 2.1 after the initial branch is cut. That being said, as the author of the most recent iteration of this, I've got my own biases at play.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15354 Thank you for merging this!
[GitHub] spark issue #15659: [SPARK-1267][SPARK-18129] Allow PySpark to be pip instal...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15659 **[Test build #67926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67926/consoleFull)** for PR 15659 at commit [`1cdcf61`](https://github.com/apache/spark/commit/1cdcf6102f38b14dcd5c9f754241f896b63f32a2).
[GitHub] spark issue #15647: [SPARK-18088][ML] Various ChiSqSelector cleanups
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15647 Merged build finished. Test PASSed.
[GitHub] spark issue #15647: [SPARK-18088][ML] Various ChiSqSelector cleanups
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15647 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67923/ Test PASSed.
[GitHub] spark issue #15647: [SPARK-18088][ML] Various ChiSqSelector cleanups
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15647 **[Test build #67923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67923/consoleFull)** for PR 15647 at commit [`7d3c74c`](https://github.com/apache/spark/commit/7d3c74c1c4867a2dd7b3f999e378cdf3fd3453bb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15718: [SPARK-16839][SQL] Simplify Struct creation code path
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15718 **[Test build #67925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67925/consoleFull)** for PR 15718 at commit [`c0263d7`](https://github.com/apache/spark/commit/c0263d7cc136d7c00455fb74748755ffc5eda8ce).
[GitHub] spark issue #15721: [SPARK-17772][ML][TEST] Add test functions for ML sample...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15721 Merged build finished. Test PASSed.
[GitHub] spark issue #15721: [SPARK-17772][ML][TEST] Add test functions for ML sample...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15721 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67922/ Test PASSed.
[GitHub] spark issue #15721: [SPARK-17772][ML][TEST] Add test functions for ML sample...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15721 **[Test build #67922 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67922/consoleFull)** for PR 15721 at commit [`e10be45`](https://github.com/apache/spark/commit/e10be455ee943230a96e57370b718683647e6f03). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15719: [SPARK-18114][HOTFIX] Fix line-too-long style error from...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15719 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67914/ Test PASSed.
[GitHub] spark issue #15719: [SPARK-18114][HOTFIX] Fix line-too-long style error from...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15719 Merged build finished. Test PASSed.
[GitHub] spark issue #15719: [SPARK-18114][HOTFIX] Fix line-too-long style error from...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15719 **[Test build #67914 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67914/consoleFull)** for PR 15719 at commit [`c56011e`](https://github.com/apache/spark/commit/c56011e907dc132353f7166b229e010b18f34e8e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14435: [SPARK-16756][SQL][WIP] Add `sql` function to LogicalPla...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14435 I'm closing this PR since the issue is closed.
[GitHub] spark pull request #14435: [SPARK-16756][SQL][WIP] Add `sql` function to Log...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/14435
[GitHub] spark issue #15637: [SPARK-18000] [SQL] Aggregation function for computing e...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15637 BTW, please follow the PR https://github.com/apache/spark/issues/15677 to rewrite your function description. Add an example for the new function if possible.
[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86031611
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala ---
@@ -0,0 +1,332 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import java.nio.ByteBuffer
+
+import scala.collection.immutable.TreeMap
+import scala.collection.mutable
+
+import com.google.common.primitives.{Doubles, Ints, Longs}
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionDescription}
+import org.apache.spark.sql.catalyst.util.ArrayBasedMapData
+import org.apache.spark.sql.types.{DataType, _}
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * The MapAggregate function for a column returns:
+ * 1. null if no non-null value exists.
--- End diff --
I see. Then we had better explain it explicitly: `Returns null if the result set is empty or all values are null`.
[GitHub] spark pull request #15675: [SPARK-18144][SQL] logging StreamingQueryListener...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15675#discussion_r86030559
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala ---
@@ -290,7 +290,10 @@ class StreamingQuerySuite extends StreamTest with BeforeAndAfter with Logging {
     // A StreamingQueryListener that gets the query status after the first completed trigger
     val listener = new StreamingQueryListener {
       @volatile var firstStatus: StreamingQueryStatus = null
-      override def onQueryStarted(queryStarted: QueryStartedEvent): Unit = { }
--- End diff --
nit: please add `@volatile`
[GitHub] spark pull request #15675: [SPARK-18144][SQL] logging StreamingQueryListener...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15675#discussion_r86030545
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingQueryListenerBus.scala ---
@@ -50,7 +52,14 @@ class StreamingQueryListenerBus(sparkListenerBus: LiveListenerBus)
   override def onOtherEvent(event: SparkListenerEvent): Unit = {
     event match {
       case e: StreamingQueryListener.Event =>
-        postToAll(e)
+        // SPARK-18144: we broadcast QueryStartedEvent to all listeners attached to this bus
+        // synchronously and to listeners attached to LiveListenerBus asynchronously. Therefore,
+        // we need to ignore QueryStartedEvent if this method is called within SparkListenerBus
+        // thread
+        if (Thread.currentThread().getName != "SparkListenerBus" ||
--- End diff --
nit: please use `!LiveListenerBus.withinListenerThread.value` instead.
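For readers following the nit above, here is a minimal, self-contained sketch of why a scoped flag is more robust than comparing thread names. It assumes the flag behaves like a `scala.util.DynamicVariable[Boolean]` that is set only while the dispatch loop runs; `ListenerThreadFlag`, `ListenerThreadDemo`, `shouldPostSynchronously` and `runAsListenerThread` are hypothetical names for illustration and are not taken from the Spark source.

```scala
import scala.util.DynamicVariable

// Hypothetical stand-in for the flag the review comment refers to.
// Assumption: it defaults to false and is true only inside the dispatch scope.
object ListenerThreadFlag {
  val withinListenerThread = new DynamicVariable[Boolean](false)
}

object ListenerThreadDemo {
  // True when the caller is NOT running inside the listener dispatch scope,
  // independent of whatever name the current thread happens to have.
  def shouldPostSynchronously(): Boolean =
    !ListenerThreadFlag.withinListenerThread.value

  // Simulates the dispatch loop: the flag is true only for the duration of `body`
  // and is restored automatically afterwards, even if `body` throws.
  def runAsListenerThread[T](body: => T): T =
    ListenerThreadFlag.withinListenerThread.withValue(true)(body)
}
```

The check no longer depends on a string literal staying in sync with the thread's name, which is presumably the motivation for the suggestion.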
[GitHub] spark issue #15671: [SPARK-14567][ML]Add instrumentation logs to ML training...
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15671 So, if we consistently use `instr.logParams(params: _*)` (even in cases where it's acceptable) then we run the risk of adding some param in the future that could "overload" the logs (like initialModel). However, if we manually select the appropriate params to log, then we risk adding some other param in the future which we do want to log, but it never gets added. Both could be problematic. For now, I think I lean towards manually selecting which params to log rather than logging all params. If we add more params later we will have to remember to add them to the logging. What are others' thoughts?
[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86030388
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala ---
@@ -0,0 +1,332 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import java.nio.ByteBuffer
+
+import scala.collection.immutable.TreeMap
+import scala.collection.mutable
+
+import com.google.common.primitives.{Doubles, Ints, Longs}
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionDescription}
+import org.apache.spark.sql.catalyst.util.ArrayBasedMapData
+import org.apache.spark.sql.types.{DataType, _}
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * The MapAggregate function for a column returns:
+ * 1. null if no non-null value exists.
+ * 2. (distinct non-null value, frequency) pairs of equi-width histogram when the number of
+ *    distinct non-null values is less than or equal to the specified maximum number of bins.
+ * 3. an empty map otherwise.
+ *
+ * @param child child expression that can produce column value with `child.eval(inputRow)`
+ * @param numBinsExpression The maximum number of bins.
+ */
+@ExpressionDescription(
+  usage =
+    """
+      _FUNC_(col, numBins) - Returns 1. null if no non-null value exists.
+      2. (distinct non-null value, frequency) pairs of equi-width histogram when the number of
+      distinct non-null values is less than or equal to the specified maximum number of bins.
+      3. an empty map otherwise.
+    """)
+case class MapAggregate(
+    child: Expression,
+    numBinsExpression: Expression,
+    override val mutableAggBufferOffset: Int,
+    override val inputAggBufferOffset: Int) extends TypedImperativeAggregate[MapDigest] {
+
+  def this(child: Expression, numBinsExpression: Expression) = {
+    this(child, numBinsExpression, 0, 0)
+  }
+
+  // Mark as lazy so that numBinsExpression is not evaluated during tree transformation.
+  private lazy val numBins: Int = numBinsExpression.eval().asInstanceOf[Int]
+
+  override def inputTypes: Seq[AbstractDataType] = {
+    Seq(TypeCollection(NumericType, TimestampType, DateType, StringType), IntegerType)
--- End diff --
For the CBO use cases, yes. However, this function is becoming a general one that could also be called by external users, in which case this limit might not make sense.
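The three-way contract described in the quoted Scaladoc can be illustrated with a small standalone sketch. This is not the implementation in the PR: `MapAggregateSketch` and `summarize` are hypothetical names, `Option` stands in for SQL null, and inputs are restricted to doubles for brevity.

```scala
import scala.collection.immutable.TreeMap

object MapAggregateSketch {
  // None models SQL null, both for input values and for the overall result.
  def summarize(values: Seq[Option[Double]], numBins: Int): Option[TreeMap[Double, Long]] = {
    val nonNull = values.flatten
    if (nonNull.isEmpty) {
      // Case 1: empty input or all values null.
      None
    } else {
      // Count the frequency of each distinct non-null value, kept in sorted order.
      val counts = nonNull.foldLeft(TreeMap.empty[Double, Long]) { (acc, v) =>
        acc.updated(v, acc.getOrElse(v, 0L) + 1L)
      }
      // Case 2: few enough distinct values, return the (value, frequency) pairs.
      // Case 3: too many distinct values, return an empty map.
      if (counts.size <= numBins) Some(counts) else Some(TreeMap.empty[Double, Long])
    }
  }
}
```

Note that an empty input falls into the first case in this sketch, which is the behavior the suggested wording "Returns null if the result set is empty or all values are null" spells out.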
[GitHub] spark issue #15700: [SPARK-17964][SparkR] Enable SparkR with Mesos client mo...
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/15700 @srowen LGTM
[GitHub] spark pull request #13880: SPARK-16178: Remove unnecessary Hive partition ch...
Github user rdblue closed the pull request at: https://github.com/apache/spark/pull/13880
[GitHub] spark issue #14906: [SPARK-17350][SQL] Disable default use of KryoSerializer...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14906 **[Test build #3390 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3390/consoleFull)** for PR 14906 at commit [`aa18bb6`](https://github.com/apache/spark/commit/aa18bb69f4ec60afffe2e5dc3c3bc6ac860b7821). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15088: SPARK-17532: Add lock debugging info to thread dumps.
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/15088 @srowen, if you have a chance, could you look at this again? I think it will be helpful for tracking down live-lock issues. Thanks!
[GitHub] spark issue #15700: [SPARK-17964][SparkR] Enable SparkR with Mesos client mo...
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/15700 ok to test
[GitHub] spark pull request #15700: [SPARK-17964][SparkR] Enable SparkR with Mesos cl...
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/15700#discussion_r86029381
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -84,7 +84,7 @@ private[spark] object RUtils {
         }
       } else {
         // Otherwise, assume the package is local
-        // TODO: support this for Mesos
+        // For Mesos, the path is also under SPARK_HOME.
--- End diff --
This comment seems unnecessary now. Maybe remove it?
[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost
Github user vlad17 commented on the issue: https://github.com/apache/spark/pull/14547 @jkbradley There seem to be more issues with deprecating impurity:
[error] [warn] /home/jenkins/workspace/SparkPullRequestBuilder/mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala:114: method setImpurity overrides concrete, non-deprecated symbol(s):setImpurity
[error] [warn] override def setImpurity(value: String): this.type = super.setImpurity(value)
[error] [warn]
[error] [warn] /home/jenkins/workspace/SparkPullRequestBuilder/mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala:111: method setImpurity overrides concrete, non-deprecated symbol(s):setImpurity
[error] [warn] override def setImpurity(value: String): this.type = super.setImpurity(value)
[error] [warn]
The shared superclass for GBT* (Tree*Params) can't have setImpurity deprecated, because it is also inherited by derived classes that should still allow impurity-setting. I find it weird that a derived class can't add a deprecation, though. Why is that rule there? Can I disable it?
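The shape that triggers the warning above can be reproduced in a few lines. The sketch below uses hypothetical names (`TreeParamsSketch`, `ForestSketch`, `BoostedSketch`) and a placeholder version string; it is not the Spark class hierarchy, only the same pattern of a subclass deprecating a method that its parent defines concretely and without deprecation.

```scala
// Hypothetical base trait: setImpurity is concrete and NOT deprecated here,
// because some subclasses are still meant to use it.
trait TreeParamsSketch {
  protected var impurity: String = "gini"
  def setImpurity(value: String): this.type = { impurity = value; this }
  def getImpurity: String = impurity
}

// A subclass for which setting impurity remains meaningful.
class ForestSketch extends TreeParamsSketch

// A subclass that wants to discourage the setter. Adding @deprecated on the
// override is what is expected to produce "overrides concrete, non-deprecated
// symbol(s)" when deprecation warnings are enabled.
class BoostedSketch extends TreeParamsSketch {
  @deprecated("impurity is ignored by this estimator", "hypothetical-version")
  override def setImpurity(value: String): this.type = super.setImpurity(value)
}
```

Going by the wording of the message, the warning concerns overridden members that are both concrete and non-deprecated, so it would presumably not fire if the base member were abstract. Whether that restructuring is acceptable for the real Tree*Params hierarchy is a separate question that this sketch does not answer.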
[GitHub] spark issue #15720: [SPARK-18167] Disable flaky SQLQuerySuite test
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15720 Merged build finished. Test PASSed.