[GitHub] spark pull request: [SPARK-8185] [SPARK-8186] [SPARK-8187] [SQL] d...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6782#issuecomment-111384200 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-8010][SQL]Promote types to StringType a...
Github user OopsOutOfMemory commented on the pull request: https://github.com/apache/spark/pull/6551#issuecomment-111384430 ping... May AmplabJenkins test this please?
[GitHub] spark pull request: [SPARK-8185] [SPARK-8186] [SPARK-8187] [SQL] d...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6782#issuecomment-111384256 Merged build started.
[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...
Github user dibbhatt commented on the pull request: https://github.com/apache/spark/pull/6707#issuecomment-111383946 Hi @tdas @zsxwing @harishreedharan, is this PR okay with you? Just following up in case anything else needs to be done. I know you must all be very busy with the 1.4 release.
[GitHub] spark pull request: [SPARK-8185] [SPARK-8186] [SPARK-8187] [SQL] d...
GitHub user adrian-wang opened a pull request: https://github.com/apache/spark/pull/6782
[SPARK-8185] [SPARK-8186] [SPARK-8187] [SQL] datetime function: date_add, date_sub, datediff
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/adrian-wang/spark udfdatecal
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6782.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #6782
commit 90b5d2ea00e1646eef18c353f6f7243c6908e022
Author: Daoyuan Wang
Date: 2015-06-12T06:55:03Z
    datetime function: date_add, date_sub, datediff
[GitHub] spark pull request: [SPARK-7284][STREAMING] Updated streaming docu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6781#issuecomment-111383748 [Test build #34757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34757/consoleFull) for PR 6781 at commit [`a66ec22`](https://github.com/apache/spark/commit/a66ec22f37f6673fd8cca47746dfd434db4b773a).
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/6780#discussion_r32293746
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -39,11 +39,12 @@ object DefaultOptimizer extends Optimizer {
     Batch("Distinct", FixedPoint(100),
       ReplaceDistinctWithAggregate) ::
     Batch("Operator Optimizations", FixedPoint(100),
-      UnionPushdown,
+      UnionPushDown,
       CombineFilters,
       PushPredicateThroughProject,
       PushPredicateThroughGenerate,
       ColumnPruning,
+      LimitPushDown,
--- End diff --
Actually, I'm thinking of adding a test for the whole batch. As this batch keeps growing, there are many more interactions between rules, so we should make sure that no rules conflict and that they all run in the right order. cc @marmbrus @rxin
[GitHub] spark pull request: [SPARK-7284][STREAMING] Updated streaming docu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6781#issuecomment-111383097 Merged build started.
[GitHub] spark pull request: [SPARK-7284][STREAMING] Updated streaming docu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6781#issuecomment-111383086 Merged build triggered.
[GitHub] spark pull request: [SPARK-7284][STREAMING] Updated streaming docu...
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/6781
[SPARK-7284][STREAMING] Updated streaming documentation
- Kinesis API updated
- Kafka version updated, and Python API for Direct Kafka added
- Added SQLContext.getOrCreate()
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/tdas/spark SPARK-7284
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6781.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #6781
commit a92ca39ea8dad8331fcf5cf6994d6770d47734d5
Author: Tathagata Das
Date: 2015-06-12T06:45:20Z
    Updated streaming documentation
commit a66ec22f37f6673fd8cca47746dfd434db4b773a
Author: Tathagata Das
Date: 2015-06-12T06:48:12Z
    Complete the line incomplete line,
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/6780#discussion_r32293578
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -39,11 +39,12 @@ object DefaultOptimizer extends Optimizer {
     Batch("Distinct", FixedPoint(100),
       ReplaceDistinctWithAggregate) ::
     Batch("Operator Optimizations", FixedPoint(100),
-      UnionPushdown,
+      UnionPushDown,
       CombineFilters,
       PushPredicateThroughProject,
       PushPredicateThroughGenerate,
       ColumnPruning,
+      LimitPushDown,
--- End diff --
`LimitPushDown` should run after `ColumnPruning`. For something like `Limit(Project(Sort(...)))`, we should try to push down `Project` through `Sort` first.
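The ordering concern in the comment above can be illustrated with a toy model. The sketch below is not Catalyst code: the plan classes, the two rules, and `runBatch` are all hypothetical stand-ins, and the Project-through-Sort rule ignores the real precondition that the sort keys must survive the projection. It only shows that, for `Limit(Project(Sort(...)))`, the order of rules inside a fixed-point batch can change the final plan.

```scala
// Toy logical plan nodes; hypothetical, NOT Catalyst's real classes.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Sort(child: Plan) extends Plan
case class Project(child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan

object RuleOrderSketch {
  // Stand-in for the part of column pruning that moves a Project below a Sort.
  def pushProjectThroughSort(plan: Plan): Plan = plan match {
    case Project(Sort(child)) => Sort(Project(pushProjectThroughSort(child)))
    case Project(child)       => Project(pushProjectThroughSort(child))
    case Sort(child)          => Sort(pushProjectThroughSort(child))
    case Limit(n, child)      => Limit(n, pushProjectThroughSort(child))
    case r: Relation          => r
  }

  // Stand-in for a limit push-down rule that moves a Limit below a Project.
  def pushLimitThroughProject(plan: Plan): Plan = plan match {
    case Limit(n, Project(child)) => Project(Limit(n, pushLimitThroughProject(child)))
    case Limit(n, child)          => Limit(n, pushLimitThroughProject(child))
    case Project(child)           => Project(pushLimitThroughProject(child))
    case Sort(child)              => Sort(pushLimitThroughProject(child))
    case r: Relation              => r
  }

  // Apply the rules in order, repeating until the plan stops changing
  // or the iteration cap is reached (a simplified fixed-point batch).
  def runBatch(rules: Seq[Plan => Plan], plan: Plan, maxIterations: Int = 100): Plan = {
    var current = plan
    var changed = true
    var i = 0
    while (changed && i < maxIterations) {
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      changed = next != current
      current = next
      i += 1
    }
    current
  }

  val pruneFirst: Seq[Plan => Plan] = Seq(pushProjectThroughSort, pushLimitThroughProject)
  val limitFirst: Seq[Plan => Plan] = Seq(pushLimitThroughProject, pushProjectThroughSort)
}
```

In this toy model, running `pruneFirst` on `Limit(10, Project(Sort(Relation("t"))))` ends with the `Project` below the `Sort`, while `limitFirst` leaves the `Project` stranded above the `Limit`. A whole-batch test of the kind proposed elsewhere in this thread would pin down which outcome is intended.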
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111382473 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111382471
[Test build #34756 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34756/console) for PR 6779 at commit [`b7efa2a`](https://github.com/apache/spark/commit/b7efa2a7af28f016c53c80c34f7f773ba373ac8b).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class Md5(child: Expression)`
   * `case class SetInFilter[T <: Comparable[T]](`
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6780#issuecomment-111382246 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6780#issuecomment-111382243
[Test build #34755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34755/console) for PR 6780 at commit [`d74c960`](https://github.com/apache/spark/commit/d74c9603564ed405ad71f137f948036eb4d0b5e7).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111382150 [Test build #34756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34756/consoleFull) for PR 6779 at commit [`b7efa2a`](https://github.com/apache/spark/commit/b7efa2a7af28f016c53c80c34f7f773ba373ac8b).
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6779#discussion_r32293459
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MiscFunctionsSuite.scala ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.dsl.expressions._
+
+class MiscFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper {
+
+  test("md5") {
+    val s1 = 'a.string.at(0)
+    val s2 = 'a.binary.at(0)
+    checkEvaluation(Md5(s1), "902fbdd2b1df0c4f70b4a5d23525e932", create_row("ABC"))
+    checkEvaluation(Md5(s2), "6ac1e56bc78f031059be7be854522c4c", create_row(Array[Byte](1,2,3,4,5,6)))
--- End diff --
this line will fail the style checker
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6780#issuecomment-111382057 [Test build #34755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34755/consoleFull) for PR 6780 at commit [`d74c960`](https://github.com/apache/spark/commit/d74c9603564ed405ad71f137f948036eb4d0b5e7).
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6779#discussion_r32293430
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -1335,6 +1336,24 @@ object functions {
   def toRadians(columnName: String): Column = toRadians(Column(columnName))
   //
+  // Misc functions
+  //
+
+  /**
+   * Calculates an MD5 128-bit checksum for the string or binary
--- End diff --
"Calculates the MD5 digest and returns the value as a 32 character hex string".
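To make the suggested wording concrete, here is a minimal standalone sketch of "MD5 digest returned as a 32 character hex string". It uses only the JDK's `java.security.MessageDigest`; the object and method names (`Md5Sketch`, `md5Hex`) are hypothetical and this is not the code in the PR, which relies on commons-codec's `DigestUtils.md5Hex`.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object Md5Sketch {
  // Hypothetical helper (not Spark's API): compute the 128-bit MD5 digest
  // and render each of its 16 bytes as two lowercase hex characters.
  def md5Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("MD5")
      .digest(bytes)
      .map(b => f"${b & 0xff}%02x")
      .mkString

  // Convenience overload for strings; assumes UTF-8 encoding.
  def md5Hex(s: String): String = md5Hex(s.getBytes(StandardCharsets.UTF_8))
}
```

For the input `"ABC"`, the test quoted elsewhere in this thread expects `902fbdd2b1df0c4f70b4a5d23525e932`, which `Md5Sketch.md5Hex("ABC")` should reproduce.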
[GitHub] spark pull request: [SPARK-8323][CORE]Remove mapOutputTracker fiel...
Github user yufan-liu commented on the pull request: https://github.com/apache/spark/pull/6778#issuecomment-111381881 I cannot figure out why we should keep a reference to the mapOutputTracker field in TaskSchedulerImpl. The mapOutputTracker can be reached from a global entry point, SparkEnv, so the TaskSetManager should reference it directly from SparkEnv.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111381794 Merged build started.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111381770 Merged build triggered.
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6780#issuecomment-111381781 Merged build started.
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6780#issuecomment-111381757 Merged build triggered.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6779#discussion_r32293367
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.security.MessageDigest
+
+import org.apache.commons.codec.digest.DigestUtils
+import org.apache.spark.sql.types.{BinaryType, StringType, DataType}
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * A function that calculates an MD5 128-bit checksum for the string or binary.
+ * Defined for String and Binary types.
+ */
+case class Md5(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def dataType: DataType = StringType
+
+  override def expectedChildTypes: Seq[DataType] =
+    if (child.dataType == BinaryType) Seq(BinaryType) else Seq(StringType)
+
+  override def children: Seq[Expression] = child :: Nil
+
+  override def eval(input: Row): Any = {
+    val value = child.eval(input)
+    if (value == null) {
+      null
+    } else if (child.dataType == BinaryType) {
+      UTF8String.fromString(DigestUtils.md5Hex(value.asInstanceOf[Array[Byte]]))
+    } else {
+      UTF8String.fromString(DigestUtils.md5Hex(value.asInstanceOf[UTF8String].getBytes))
+    }
+  }
+
+  override def toString: String = s"md5($child)"
--- End diff --
upper case MD5
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6779#discussion_r32293314
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.security.MessageDigest
+
+import org.apache.commons.codec.digest.DigestUtils
+import org.apache.spark.sql.types.{BinaryType, StringType, DataType}
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * A function that calculates an MD5 128-bit checksum for the string or binary.
+ * Defined for String and Binary types.
+ */
+case class Md5(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def dataType: DataType = StringType
+
+  override def expectedChildTypes: Seq[DataType] =
+    if (child.dataType == BinaryType) Seq(BinaryType) else Seq(StringType)
--- End diff --
This doesn't make sense, since it uses the child's data type to check the data type. You should remove ExpectsInputTypes and explicitly define the type check.
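One way to read the suggestion above is that the expression should state the set of accepted input types itself instead of deriving the expected type from whatever the child happens to be. The sketch below is only an illustration under that reading: the tiny `DataType` hierarchy and the `checkMd5Input` helper are hypothetical and do not correspond to Catalyst's actual type-checking API.

```scala
object Md5TypeCheckSketch {
  // Toy data types; hypothetical stand-ins for Spark SQL's DataType objects.
  sealed trait DataType
  case object StringType extends DataType
  case object BinaryType extends DataType
  case object IntegerType extends DataType

  // Explicit check: the accepted types are listed here and do not depend on
  // the child, so an unsupported child type is reported rather than being
  // silently treated as a string.
  def checkMd5Input(childType: DataType): Either[String, Unit] = childType match {
    case StringType | BinaryType => Right(())
    case other => Left(s"MD5 accepts string or binary input, not $other")
  }
}
```

With this shape, `checkMd5Input(IntegerType)` yields a `Left` carrying an error message, whereas the `expectedChildTypes` in the quoted diff would simply report `Seq(StringType)` for any non-binary child.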
[GitHub] spark pull request: [SPARK-7267][SPARK-7289] add LimitPushDown rul...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/6780
[SPARK-7267][SPARK-7289] add LimitPushDown rule
This PR fixes SPARK-7267 by pushing the limit down instead of lifting it up, so that in some cases we can keep the sort right beneath the limit.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/cloud-fan/spark limit
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6780.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #6780
commit d8ecb14783bea756165bff3e96530fb1d6347e37
Author: Wenchen Fan
Date: 2015-06-12T02:23:12Z
    fix existing
commit d74c9603564ed405ad71f137f948036eb4d0b5e7
Author: Wenchen Fan
Date: 2015-06-12T06:06:13Z
    add LimitPushDown
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111380004 Jenkins, ok to test.
[GitHub] spark pull request: [SPARK-8323][CORE]Remove mapOutputTracker fiel...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6778#issuecomment-111379832 I don't think that meaningfully helps readability. I would not change it just to change it; this is too trivial compared to the overhead of reviewing it.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111379673 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/6779#issuecomment-111379265 @rxin @liancheng can you trigger the unit test?
[GitHub] spark pull request: [SPARK-8323][CORE]Remove mapOutputTracker fiel...
Github user yufan-liu commented on the pull request: https://github.com/apache/spark/pull/6778#issuecomment-111378963 @srowen It makes the code easier to read, especially for those who are just starting to learn the internal implementation of the Spark core component.
[GitHub] spark pull request: [SPARK-8218][SQL] Add binary log math function
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/6725#issuecomment-111378648 @rxin @marmbrus I think your comments are addressed now.
[GitHub] spark pull request: [SPARK-7884] Move block deserialization from B...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6423#issuecomment-111378626 Sorry, it has to be this weekend. Still busy with some other stuff.
[GitHub] spark pull request: [SPARK-8234][SQL] misc function: md5
GitHub user qiansl127 opened a pull request: https://github.com/apache/spark/pull/6779 [SPARK-8234][SQL] misc function: md5 You can merge this pull request into a Git repository by running: $ git pull https://github.com/qiansl127/spark MD5 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6779.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6779 commit b7efa2a7af28f016c53c80c34f7f773ba373ac8b Author: Shilei Date: 2015-06-12T06:20:17Z Add md5 function
[GitHub] spark pull request: [Tiny Fix][Core]Remove mapOutputTracker field ...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6778#issuecomment-111378373 That's probably OK but what does this fix?
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111378083 [Test build #34754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34754/consoleFull) for PR 6737 at commit [`e9c35d7`](https://github.com/apache/spark/commit/e9c35d74f5f024db47b44762fdecade807948b05).
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376870 Merged build started.
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376849 Merged build triggered.
[GitHub] spark pull request: [Tiny Fix][Core]Remove mapOutputTracker field ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6778#issuecomment-111376843 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376644 [Test build #34753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34753/consoleFull) for PR 6737 at commit [`e9c35d7`](https://github.com/apache/spark/commit/e9c35d74f5f024db47b44762fdecade807948b05).
[GitHub] spark pull request: Remove mapOutputTracker in TaskSchedulerImpl
GitHub user yufan-liu opened a pull request: https://github.com/apache/spark/pull/6778 Remove mapOutputTracker in TaskSchedulerImpl Because TaskSchedulerImpl's mapOutputTracker field is referenced only once, in TaskSetManager, I think we could remove the mapOutputTracker field from the TaskSchedulerImpl class. Instead, we could reference the mapOutputTracker from SparkEnv directly in TaskSetManager. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yufan-liu/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6778.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6778 commit b45f4d894cc2ad944bbe3c4b522b4a472121196c Author: åé°å¸ Date: 2015-06-12T06:17:28Z Remove mapOutputTracker reference The mapOutputTracker reference is used only once, in the TaskSetManager. We could refer to the mapOutputTracker directly from SparkEnv in the TaskSetManager instead of from TaskSchedulerImpl. commit 435a465d65ab33082d16cfc6b60ffeb919ecd450 Author: åé°å¸ Date: 2015-06-12T06:20:30Z Get mapOutputTracker from SparkEnv Get mapOutputTracker from SparkEnv directly.
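The refactoring proposed in this pull request can be pictured with a small Python analogue. This is only a sketch: the real classes are Scala, and every class body, method name, and constructor argument below is a hypothetical stand-in chosen for illustration, not Spark's actual code. The only idea taken from the message is that the scheduler stops carrying a pass-through tracker field and the task-set manager asks the shared environment for the tracker instead.

```python
class MapOutputTracker:
    """Hypothetical stand-in for the tracker object."""


class SparkEnv:
    """Hypothetical process-wide environment holding shared services."""

    _env = None

    def __init__(self, map_output_tracker):
        self.map_output_tracker = map_output_tracker

    @classmethod
    def set(cls, env):
        cls._env = env

    @classmethod
    def get(cls):
        return cls._env


class TaskSchedulerImpl:
    """After the change: no map_output_tracker field is kept here."""


class TaskSetManager:
    def __init__(self, sched):
        self.sched = sched

    def tracker(self):
        # Before the change this would have read self.sched.map_output_tracker;
        # now the tracker is looked up from the environment directly.
        return SparkEnv.get().map_output_tracker
```

The trade-off is the one raised in the review: the scheduler has one field fewer, at the cost of the manager depending on a global accessor.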
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376427 It may be a problem with the PR builder.
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376435 Merged build started.
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376424 Merged build triggered.
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111376417 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/6737#issuecomment-111375429 Can somebody help me with this test failure?
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/6777#discussion_r32292344 --- Diff: ec2/spark_ec2.py --- @@ -70,7 +70,7 @@ "1.2.1", "1.3.0", "1.3.1", -"1.4.0" +"1.4.0", --- End diff -- I also do this when the language syntax allows, since it means that adding a next line will not show up as a change (because of adding the comma to the previous line) but just a 1-line addition. Really minor but has some tiny value.
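The point about trailing commas can be checked with a short Python sketch. The helper name `changed_lines` and the sample version strings being appended are assumptions made for this illustration; only the `"1.3.1"` and `"1.4.0"` entries come from the quoted diff. It counts how many lines a diff touches when one more entry is appended to a list written with and without a trailing comma.

```python
import difflib


def changed_lines(before, after):
    """Return only the added/removed lines of a unified diff (no headers)."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    return [
        line
        for line in diff
        if line[:1] in ("+", "-") and not line.startswith(("+++", "---"))
    ]


# List body without a trailing comma, then with "1.4.1" appended.
no_comma_before = '"1.3.1",\n"1.4.0"'
no_comma_after = '"1.3.1",\n"1.4.0",\n"1.4.1"'

# Same list written with a trailing comma throughout.
comma_before = '"1.3.1",\n"1.4.0",'
comma_after = '"1.3.1",\n"1.4.0",\n"1.4.1",'

print(changed_lines(no_comma_before, no_comma_after))
print(changed_lines(comma_before, comma_after))
```

Without the trailing comma the previous last line is rewritten as well, so the diff shows a removal plus two additions; with it, the diff is a single added line.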
[GitHub] spark pull request: [SQL][minor] correct semanticEquals logic
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6261#issuecomment-111369508 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SQL][minor] correct semanticEquals logic
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6261#issuecomment-111369504 [Test build #34747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34747/console) for PR 6261 at commit [`4daef88`](https://github.com/apache/spark/commit/4daef887931bd662901c1c9af24a8cb66286ce1b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user markmsmith commented on a diff in the pull request: https://github.com/apache/spark/pull/6777#discussion_r32292102 --- Diff: ec2/spark_ec2.py --- @@ -70,7 +70,7 @@ "1.2.1", "1.3.0", "1.3.1", -"1.4.0" +"1.4.0", --- End diff -- I was just copying the convention of what was there before in the master branch - both arrays had trailing commas, so I figured it was on purpose.
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111369374 Merged build triggered.
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111369425 [Test build #34752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34752/consoleFull) for PR 6775 at commit [`d7b01f8`](https://github.com/apache/spark/commit/d7b01f87041980eb9f65a4284936d22036f59c57).
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111369386 Merged build started.
[GitHub] spark pull request: [SPARK-8319] [CORE] [SQL] Update logic related...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6773#issuecomment-111369141 [Test build #34751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34751/consoleFull) for PR 6773 at commit [`85a4628`](https://github.com/apache/spark/commit/85a46287e2264ff4d736a5feabe1b36becb519af).
[GitHub] spark pull request: [SPARK-8319] [CORE] [SQL] Update logic related...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6773#issuecomment-111368847 Merged build triggered.
[GitHub] spark pull request: [SPARK-8319] [CORE] [SQL] Update logic related...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6773#issuecomment-111368864 Merged build started.
[GitHub] spark pull request: [SPARK-8319] [CORE] [SQL] Update logic related...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6773#issuecomment-111368695 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-6566][SQL]: Related changes for newer p...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/5889
[GitHub] spark pull request: [SPARK-8057][Core]Call TaskAttemptContext.getT...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6599#issuecomment-111366789 @zsxwing Have you tested this change with a Hadoop1 cluster by any chance?
[GitHub] spark pull request: [SPARK-6566][SQL]: Related changes for newer p...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/5889#issuecomment-111366693 @saucam Sorry for the late reply. This LGTM now. The inefficient code path in Parquet still exists (sequentially retrieving `FileStatus`), but now it only affects client side metadata retrieving, which is deprecated. So I'm going to merge this to master. Thanks for working on this!
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/6777#discussion_r32291542 --- Diff: ec2/spark_ec2.py --- @@ -70,7 +70,7 @@ "1.2.1", "1.3.0", "1.3.1", -"1.4.0" +"1.4.0", --- End diff -- Not sure why you need the comma here?
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111366443 [Test build #34750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34750/consoleFull) for PR 6776 at commit [`d744244`](https://github.com/apache/spark/commit/d744244d7fa5da4ffba13ff9a4c03caf15447a7b).
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user markmsmith commented on the pull request: https://github.com/apache/spark/pull/6777#issuecomment-111366367 @shivaram
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111366230 Merged build started.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111366217 Merged build triggered.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user calvinjia commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111366156 @shivaram Yup, Spark 1.4 -> Tachyon 0.6.4.
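The version pairing confirmed here could be recorded in a lookup table along the following lines. This is a hedged sketch rather than the actual contents of `ec2/spark_ec2.py`: the names `SPARK_TACHYON_MAP` and `get_tachyon_version` and the `"1.4.0"` key are assumptions for illustration, and only the Spark 1.4 to Tachyon 0.6.4 pairing is taken from the thread.

```python
# Hypothetical excerpt: only the 1.4 -> 0.6.4 pairing comes from the thread.
SPARK_TACHYON_MAP = {
    "1.4.0": "0.6.4",
}


def get_tachyon_version(spark_version):
    """Return the Tachyon version paired with a Spark version, or "" if unknown."""
    return SPARK_TACHYON_MAP.get(spark_version, "")


print(get_tachyon_version("1.4.0"))
```

Returning an empty string for unknown versions is one possible design; raising an error instead would surface a missing mapping earlier.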
[GitHub] spark pull request: [SPARK-6980] [CORE] [WIP] Akka timeout excepti...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/6205#discussion_r32291362
--- Diff: core/src/test/scala/org/apache/spark/rpc/akka/AkkaRpcEnvSuite.scala ---
@@ -47,4 +56,60 @@ class AkkaRpcEnvSuite extends RpcEnvSuite {
     }
   }
 
+  test("timeout on ask Future with RpcTimeout") {
+
+    class EchoActor(sleepDuration: Long) extends Actor {
+      def receive: Receive = {
+        case msg =>
+          Thread.sleep(sleepDuration)
+          sender() ! msg
+      }
+    }
+
+    val system = ActorSystem("EchoSystem")
+    val echoActor = system.actorOf(Props(new EchoActor(0)), name = "echo")
+    val sleepyActor = system.actorOf(Props(new EchoActor(50)), name = "sleepy")
+
+    val shortProp = "spark.rpc.short.timeout"
+    val timeout = new RpcTimeout(10 millis, shortProp)
+
+    try {
+
+      // Ask with immediate response
+      var fut = echoActor.ask("hello")(timeout.duration).mapTo[String].
+        recover(timeout.addMessageIfTimeout)
+
+      // This should complete successfully
+      val result = timeout.awaitResult(fut)
+
+      assert(result.nonEmpty)
+
+      // Ask with delayed response
+      fut = sleepyActor.ask("goodbye")(timeout.duration).mapTo[String].
+        recover(timeout.addMessageIfTimeout)
+
+      // Allow future to complete with failure using plain Await.result, this will return
+      // once the future is complete
+      val msg1 =
+        intercept[RpcTimeoutException] {
+          Await.result(fut, 200 millis)
+        }.getMessage()
+
+      assert(msg1.contains(shortProp))
+
+      // Use RpcTimeout.awaitResult to process Future, since it has already failed with
+      // RpcTimeoutException, the same exception should be thrown
+      val msg2 =
+        intercept[RpcTimeoutException] {
+          timeout.awaitResult(fut)
+        }.getMessage()
--- End diff --
done
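The behaviour this test exercises, a timeout failure that names the configuration property controlling it, can be sketched in Python with `concurrent.futures`. This is an analogue under stated assumptions, not Spark's `RpcTimeout` API: `RpcTimeoutError`, `await_result`, and `echo` are hypothetical names invented for the illustration, and only the property name `spark.rpc.short.timeout` is borrowed from the diff.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class RpcTimeoutError(Exception):
    """Hypothetical stand-in for the RpcTimeoutException named in the diff."""


def await_result(future, timeout_s, prop_name):
    """Wait for the future; on timeout, re-raise with the controlling property named."""
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout as exc:
        raise RpcTimeoutError(
            f"timed out after {timeout_s}s; "
            f"this timeout is controlled by {prop_name}"
        ) from exc


def echo(msg, sleep_s):
    """Echo the message back after an optional delay."""
    time.sleep(sleep_s)
    return msg


if __name__ == "__main__":
    prop = "spark.rpc.short.timeout"
    with ThreadPoolExecutor() as pool:
        # Immediate response: completes well within the timeout.
        print(await_result(pool.submit(echo, "hello", 0), 1.0, prop))
        # Delayed response: the wait times out and the error names the property.
        try:
            await_result(pool.submit(echo, "goodbye", 0.3), 0.01, prop)
        except RpcTimeoutError as err:
            print(err)
```

Wrapping the low-level timeout in an error that carries the property name is what lets a user see which setting to raise, which is the assertion the Scala test makes with `msg1.contains(shortProp)`.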
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user markmsmith commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111365974 Good catch, I'll update and re-push.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111365896 @markmsmith thanks for the PR. However I am not sure the tachyon mapping is correct. If I am reading it correctly it looks like 1.4 depends on 0.6.4 https://github.com/apache/spark/blob/branch-1.4/core/pom.xml#L293 cc @haoyuan @calvinjia who might know more about tachyon
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111365876 [Test build #34749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34749/consoleFull) for PR 6776 at commit [`e4f14d3`](https://github.com/apache/spark/commit/e4f14d395084819fdf0fb32a523c0662ab7230da).
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6777#issuecomment-111365820 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111365825 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111365822 [Test build #34748 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34748/console) for PR 6775 at commit [`b1ac20b`](https://github.com/apache/spark/commit/b1ac20b43f5a4a8002bb0ee39a00d6050fc3d63a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class ConcatWS(sep: Expression, child: Expression)`
   * `case class Concat(child: Expression*)`
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111365835 Merged build started.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111365823 Merged build triggered.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111365789 Jenkins, ok to test
[GitHub] spark pull request: [SPARK-8218][SQL] Add binary log math function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6725#issuecomment-111365733 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
GitHub user markmsmith opened a pull request: https://github.com/apache/spark/pull/6777
[SPARK-8322][EC2] Added spark 1.4.0 into the VALID_SPARK_VERSIONS and… … SPARK_TACHYON_MAP
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/markmsmith/spark branch-1.4
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6777.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6777
commit 90d165528f03c9e34442bb1d5cadf22b28b38ae0
Author: Mark Smith
Date: 2015-06-12T05:05:43Z
    [SPARK-8322][EC2] Added spark 1.4.0 into the VALID_SPARK_VERSIONS and SPARK_TACHYON_MAP
[GitHub] spark pull request: [SPARK-8218][SQL] Add binary log math function
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6725#issuecomment-111365725 [Test build #34745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34745/console) for PR 6725 at commit [`3d75bfc`](https://github.com/apache/spark/commit/3d75bfcc1d5e7ed1ee0a086280204a6ea11ac28e).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class Logarithm(left: Expression, right: Expression)`
[GitHub] spark pull request: [SPARK-7862][SQL]Fix the deadlock in script tr...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6404
[GitHub] spark pull request: [SPARK-7862][SQL]Fix the deadlock in script tr...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/6404#issuecomment-111365513 Thanks, merging to master.
[GitHub] spark pull request: [SPARK-8317] [SQL] Do not push sort into shuff...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6772
[GitHub] spark pull request: [SPARK-8317] [SQL] Do not push sort into shuff...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/6772#issuecomment-111361987 Merging with master.
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111361943 [Test build #34748 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34748/consoleFull) for PR 6775 at commit [`b1ac20b`](https://github.com/apache/spark/commit/b1ac20b43f5a4a8002bb0ee39a00d6050fc3d63a).
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111360793 Merged build started.
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-111360739 Merged build triggered.
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6776#issuecomment-111360734 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8322][EC2] Added spark 1.4.0 into the V...
GitHub user markmsmith opened a pull request: https://github.com/apache/spark/pull/6776
[SPARK-8322][EC2] Added spark 1.4.0 into the VALID_SPARK_VERSIONS and… … SPARK_TACHYON_MAP
This contribution is my original work and I license the work to the project under the project's open source license.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/markmsmith/spark SPARK-8322
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6776.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6776
commit e4f14d395084819fdf0fb32a523c0662ab7230da
Author: Mark Smith
Date: 2015-06-12T05:05:43Z
    [SPARK-8322][EC2] Added spark 1.4.0 into the VALID_SPARK_VERSIONS and SPARK_TACHYON_MAP
[GitHub] spark pull request: [SPARK-8309] [CORE] Support for more than 12M ...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6763#issuecomment-111359746 Yes, 2^31 is not possible at all. There are caveats to the actual max array size, yes, but this is really an orthogonal issue. I think it's best to not assert about the size here at all, or just assert about a negative value on overflow. I don't think anything else can or should be done. The right value of `POSITION_MAX` is still 0x7FFF.
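The suggestion to "just assert about a negative value on overflow" can be sketched as follows. This is only an illustration of the idea, not the code in the PR: `OverflowSketch` and `grow` are hypothetical names, and the doubling policy is an assumption made for the example.

```scala
object OverflowSketch {
  // Hypothetical helper: doubles a capacity held in a 32-bit signed Int.
  // Doubling 2^30 wraps around to a negative number because 2^31 does not
  // fit in an Int, so the check only looks for a non-positive result.
  def grow(capacity: Int): Int = {
    val doubled = capacity * 2
    require(doubled > 0, s"capacity overflowed Int: $capacity * 2 = $doubled")
    doubled
  }
}
```

The check makes no claim about the largest array a JVM will actually allocate, which the comment above treats as a separate question.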
[GitHub] spark pull request: [SPARK-8314][MLlib] improvement in performance...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/6768#discussion_r32290545
--- Diff: mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala ---
@@ -161,10 +161,10 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext {
   }
   test("appendBias") {
-    val sv = Vectors.sparse(3, Seq((0, 1.0), (2, 3.0)))
+    val sv = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0)))
--- End diff --
okay, after thinking carefully, it seems it's not necessary. please revert the test.
[GitHub] spark pull request: [SPARK-8129] [CORE] [SECURITY] Securely pass a...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6676#issuecomment-111359579 OK, I think you can close this if it's not active. It doesn't go away and can still be commented on.
[GitHub] spark pull request: [SPARK-4362] [MLLIB] Make prediction probabili...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6761#issuecomment-111359266 [Test build #900 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/900/consoleFull) for PR 6761 at commit [`7f53d08`](https://github.com/apache/spark/commit/7f53d08b2cfd17353f1662417e3a6999cc3e3408).
[GitHub] spark pull request: [SPARK-8314][MLlib] improvement in performance...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/6768#discussion_r32290408
--- Diff: mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala ---
@@ -161,10 +161,10 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext {
   }
   test("appendBias") {
-    val sv = Vectors.sparse(3, Seq((0, 1.0), (2, 3.0)))
+    val sv = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0)))
--- End diff --
In the original test the last element is non-zero, so ``sv.size == sv.values.length``. I just want to prevent this.
[GitHub] spark pull request: [SPARK-4362] [MLLIB] Make prediction probabili...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6761#issuecomment-111359221 No, I triggered it manually. There may still be a problem with the PR builder
[GitHub] spark pull request: [SPARK-8314][MLlib] improvement in performance...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6768#issuecomment-111359152 [Test build #899 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/899/consoleFull) for PR 6768 at commit [`e999d79`](https://github.com/apache/spark/commit/e999d79ad76426614cc7a2db65a068ed402d0216).
[GitHub] spark pull request: [SPARK-8314][MLlib] improvement in performance...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/6768#discussion_r32290358
--- Diff: mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala ---
@@ -161,10 +161,10 @@ class MLUtilsSuite extends SparkFunSuite with MLlibTestSparkContext {
   }
   test("appendBias") {
-    val sv = Vectors.sparse(3, Seq((0, 1.0), (2, 3.0)))
+    val sv = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0)))
--- End diff --
(Why change the test?)
[GitHub] spark pull request: [SPARK-8314][MLlib] improvement in performance...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/6768#discussion_r32290348
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala ---
@@ -270,12 +270,28 @@ object MLUtils {
    * Returns a new vector with `1.0` (bias) appended to the input vector.
    */
   def appendBias(vector: Vector): Vector = {
-    val vector1 = vector.toBreeze match {
-      case dv: BDV[Double] => BDV.vertcat(dv, new BDV[Double](Array(1.0)))
-      case sv: BSV[Double] => BSV.vertcat(sv, new BSV[Double](Array(0), Array(1.0), 1))
-      case v: Any => throw new IllegalArgumentException("Do not support vector type " + v.getClass)
+    vector match {
+      case dv: DenseVector =>
+        val inputValues = dv.values
+        val inputLength = inputValues.length
+        val outputValues = Array.ofDim[Double](inputLength + 1)
+        System.arraycopy(inputValues, 0, outputValues, 0, inputLength)
+        outputValues(inputLength) = 1.0
+        Vectors.dense(outputValues)
+      case sv: SparseVector =>
+        val inputValues = sv.values
+        val inputIndices = sv.indices
+        val inputValuesLength = inputValues.length
+        val dim = sv.size
+        val outputValues = Array.ofDim[Double](inputValuesLength + 1)
+        val outputIndices = Array.ofDim[Int](inputValuesLength + 1)
+        System.arraycopy(inputValues, 0, outputValues, 0, inputValuesLength)
+        System.arraycopy(inputIndices, 0, outputIndices, 0, inputValuesLength)
+        outputValues(inputValuesLength) = 1.0
+        outputIndices(inputValuesLength) = dim
+        Vectors.sparse(dim + 1, outputIndices, outputValues)
+      case _ => throw new IllegalArgumentException("Do not support vector type " + vector.getClass)
--- End diff --
Please use
```scala
case _ => throw new IllegalArgumentException(s"Do not support vector type ${vector.getClass}")
```
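The sparse branch of the diff above can be tried without a Spark dependency. The following is a minimal sketch that assumes a hypothetical `SparseVec` case class in place of MLlib's `SparseVector`; it mirrors the array-copy approach from the diff but is not the PR's code.

```scala
// Hypothetical stand-in for a sparse vector; not Spark's SparseVector class.
final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double])

object AppendBiasSketch {
  // Returns a new sparse vector with 1.0 (bias) stored at index `size`.
  def appendBias(sv: SparseVec): SparseVec = {
    val n = sv.values.length
    // copyOf pads the extra slot with 0 / 0.0; both slots are overwritten below.
    val outIndices = java.util.Arrays.copyOf(sv.indices, n + 1)
    val outValues = java.util.Arrays.copyOf(sv.values, n + 1)
    outIndices(n) = sv.size // the bias occupies the new last position
    outValues(n) = 1.0
    SparseVec(sv.size + 1, outIndices, outValues)
  }
}
```

For example, appending a bias to a vector of size 4 with non-zeros at indices 0 and 2 yields a vector of size 5 whose indices are 0, 2 and 4, so the logical size and the number of stored values stay distinct.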
[GitHub] spark pull request: [SPARK-8316] Upgrade to Maven 3.3.3
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/6770#issuecomment-111358755 I believe we'd also need to require Maven 3.3 in the build (in the enforcer plugin?) so that it actually fails if used with a local version that's not high enough. 3.3 is fairly new and most dev environments won't have it. I suppose that's why `build/mvn` exists too. It may still cause some surprise, since most `mvn ...` invocations would then fail. Not enforcing this risks people hitting the issue that prompted this, I suppose, without knowing. Then again, earlier versions of Maven otherwise appear fine. Hm. What do you guys think about the tradeoffs?
[GitHub] spark pull request: [SPARK-6583][SQL] Support aggregated function ...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/5290#issuecomment-111357564 I think we are close. What about this: https://github.com/watermen/spark/pull/1/files