[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193585815 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52628/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193585535 **[Test build #52628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52628/consoleFull)** for PR 11569 at commit [`ac8f4c9`](https://github.com/apache/spark/commit/ac8f4c9ca56b592c32c60dc945023050df89bdb4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13600] [MLlib] [WIP] Incorrect number o...
Github user oliverpierson commented on the pull request: https://github.com/apache/spark/pull/11553#issuecomment-193584221 Putting this up for review now. Tests are passing on my machine. Using `approxQuantile` in DataFrame stats reduces the amount of code required by a good bit. As for the default `relativeError` value, which is passed on to `approxQuantile`... perhaps @jkbradley has a suggestion? I basically chose 0.01 on a whim, since I couldn't really make a compelling argument for any particular value.
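For context on what a `relativeError` of 0.01 means in practice: as far as I understand the documented contract of `approxQuantile`, for N rows and target probability p the exact rank of the returned value lies between floor((p - err) * N) and ceil((p + err) * N). The following is a minimal plain-Scala sketch of that bound only. It does not call Spark, and `QuantileErrorSketch`, `rankBounds`, `rank`, and `isAcceptable` are hypothetical names introduced for illustration.

```scala
object QuantileErrorSketch {
  /** Inclusive rank bounds allowed for probability p over n rows (assumed contract, see lead-in). */
  def rankBounds(n: Long, p: Double, relativeError: Double): (Long, Long) = {
    val lo = math.floor((p - relativeError) * n).toLong
    val hi = math.ceil((p + relativeError) * n).toLong
    // Clamp to the valid rank range [0, n].
    (math.max(lo, 0L), math.min(hi, n))
  }

  /** Exact rank of x in the data: the number of elements less than or equal to x. */
  def rank(data: Seq[Double], x: Double): Long = data.count(_ <= x).toLong

  /** True if x would be an acceptable approximate quantile under the bound above. */
  def isAcceptable(data: Seq[Double], x: Double, p: Double, relativeError: Double): Boolean = {
    val (lo, hi) = rankBounds(data.size.toLong, p, relativeError)
    val r = rank(data, x)
    lo <= r && r <= hi
  }
}
```

Under this reading, with 1,000 rows and p = 0.5, an error of 0.01 lets the result sit roughly 10 ranks either side of the true median, which may help when arguing for or against a particular default.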
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11565#issuecomment-193583045 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11565#issuecomment-193583046 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52620/ Test PASSed.
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11565#issuecomment-193582844 **[Test build #52620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52620/consoleFull)** for PR 11565 at commit [`467b095`](https://github.com/apache/spark/commit/467b095d89ce641f568aade09d710fb9ea573273). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193582705 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193582706 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52621/ Test PASSed.
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193582028 **[Test build #52621 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52621/consoleFull)** for PR 11550 at commit [`c51d4ef`](https://github.com/apache/spark/commit/c51d4efbe72e4713a53ee7706996bef837d79fa5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193580343 **[Test build #52628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52628/consoleFull)** for PR 11569 at commit [`ac8f4c9`](https://github.com/apache/spark/commit/ac8f4c9ca56b592c32c60dc945023050df89bdb4).
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11565#discussion_r55311096
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -345,8 +343,6 @@ object ColumnPruning extends Rule[LogicalPlan] {
     // Prunes the unused columns from child of Aggregate/Window/Expand/Generate
     case a @ Aggregate(_, _, child) if (child.outputSet -- a.references).nonEmpty =>
       a.copy(child = prunedChild(child, a.references))
-    case w @ Window(_, _, _, _, child) if (child.outputSet -- w.references).nonEmpty =>
--- End diff --
It seems we don't even have any tests for this in `ColumnPruningSuite`.
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193579525 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193579527 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52617/ Test PASSed.
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193578999 **[Test build #52617 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52617/consoleFull)** for PR 11235 at commit [`312cb32`](https://github.com/apache/spark/commit/312cb326922624e95528b7f2dc92129c59b3b524). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [HOT-FIX][BUILD] Use the new location of `chec...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/11567#issuecomment-193578502 The `PySpark` failure is unrelated to this PR, but I rebased this PR onto master because this is still a problem.
[GitHub] spark pull request: [HOT-FIX][BUILD] Use the new location of `chec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11567#issuecomment-193578669 **[Test build #52627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52627/consoleFull)** for PR 11567 at commit [`4a58fba`](https://github.com/apache/spark/commit/4a58fba530df6e4b665389804908d04da88e7d4f).
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193576069 @falaki Just to let you know, I changed the name `CSVInferSchema` to `InferSchema`, mainly for consistent naming across the CSV and JSON data sources, but maybe they should be `CSVInferSchema` and `JSONInferSchema`.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193573893 **[Test build #52626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52626/consoleFull)** for PR 11555 at commit [`c82229a`](https://github.com/apache/spark/commit/c82229a42efec9131652435b9543df81d1feab6c).
[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11336#issuecomment-193573753 I personally find it confusing having to reason about when we can "head"/"collect"/"show" and when we cannot, and that's why the Scala/Python version of the API didn't have this feature.
[GitHub] spark pull request: [SPARK-12719][SQL] SQL generation support for ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11563#issuecomment-193572743 "as we wanted to generate SQLs which is closer to the original SQL" Why is this a goal? I worry about the fragility of these two cases, if we really only need one to satisfy correctness.
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11235#discussion_r55309414
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -144,6 +146,56 @@ object EliminateSerialization extends Rule[LogicalPlan] {
 }
 /**
+ * Add Filter to left and right of an inner Join to filter out rows with null keys.
+ * So we may not need to check nullability of keys while joining. Besides, by filtering
+ * out keys with null, we can also reduce data size in Join.
+ */
+object AddFilterOfNullForInnerJoin extends Rule[LogicalPlan] with PredicateHelper {
--- End diff --
Renamed. I will handle semi joins and outer joins in a separate PR once this one gets merged.
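To illustrate why filtering null keys before an inner join is safe, here is a minimal plain-Scala sketch. It is not the optimizer rule from the PR: `NullKeyFilterSketch`, `Row`, `innerJoin`, and `dropNullKeys` are hypothetical names, and a SQL null key is modelled as `None` on the assumption that a null key never satisfies an equality join condition.

```scala
object NullKeyFilterSketch {
  // A row with a nullable join key; None stands in for SQL NULL.
  final case class Row(key: Option[Int], value: String)

  /** Inner equi-join in which a null key (None) never matches anything. */
  def innerJoin(left: Seq[Row], right: Seq[Row]): Seq[(String, String)] =
    for {
      l <- left
      r <- right
      if l.key.isDefined && l.key == r.key
    } yield (l.value, r.value)

  /** The pre-join filter: drop rows whose key is null. */
  def dropNullKeys(rows: Seq[Row]): Seq[Row] = rows.filter(_.key.isDefined)
}
```

Joining the filtered inputs gives the same rows as joining the originals, while the join itself sees fewer rows and no longer has to consider null keys.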
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193566395 It seems better to keep SparkR as a base package providing core functionalities, while visualization features can be implemented in other packages based on SparkR. There is an example at https://github.com/PAPL-SKKU/ggplot2.SparkR.
[GitHub] spark pull request: [SPARK-12719][SQL] SQL generation support for ...
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/11563#discussion_r55309116
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/LogicalPlanToSQLSuite.scala ---
@@ -445,4 +461,86 @@ class LogicalPlanToSQLSuite extends SQLBuilderTest with SQLTestUtils {
       "f1", "b[0].f1", "f1", "c[foo]", "d[0]"
     )
   }
+
+  test("SQL generation for generate") {
--- End diff --
@rxin I have split the tests into 5 groups. Please let me know if it looks OK to you.
[GitHub] spark pull request: [SPARK-13738][SQL] Cleanup Data Source resolut...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11572#issuecomment-193565461 **[Test build #52625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52625/consoleFull)** for PR 11572 at commit [`cf7c719`](https://github.com/apache/spark/commit/cf7c719b72896450affad9b866ad9077a6140e40).
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193562698 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10387][ML] Add code gen for gbt
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/9524#discussion_r55308566
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/codeGenerator.scala ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.tree
+
+import org.codehaus.janino.ClassBodyEvaluator
+
+import org.apache.spark.Logging
+import org.apache.spark.mllib.linalg.{Vector, Vectors}
+
+/**
+ * An object for creating a code generated decision tree model.
+ * NodeToTree is used to convert a node to a series of code gen
+ * if/else statements returning the prediction for a
+ * given vector.
+ * getScorer wraps this and provides a function we can use to get
+ * the prediction.
+ */
+private[spark] object CodeGenerationDecisionTreeModel extends Logging {
+  private val prefix = "mllibCodeGen"
+  private val curId = new java.util.concurrent.atomic.AtomicInteger()
+
+  /**
+   * Compile the Java source code into a Java class, using Janino.
+   * Based on Spark SQL's implementation. This should be moved to a common class
+   * once we have multiple code generators in ML.
+   *
+   * It will track the time used to compile
+   */
+  protected def compile(code: String, implements: Array[Class[_]]): Class[_] = {
+    val startTime = System.nanoTime()
+    val evaluator = new ClassBodyEvaluator()
+    val clName = freshName()
+    evaluator.setParentClassLoader(getClass.getClassLoader)
+    evaluator.setImplementedInterfaces(implements)
+    evaluator.setClassName(clName)
+    evaluator.setDefaultImports(Array(
+      "org.apache.spark.mllib.linalg.Vectors",
+      "org.apache.spark.mllib.linalg.Vector"
+    ))
+    evaluator.cook(s"${clName}.java", code)
+    val clazz = evaluator.getClazz()
+    val endTime = System.nanoTime()
+    // System.nanoTime() is in nanoseconds, so divide by 1,000,000 to get milliseconds
+    def timeMs: Double = (endTime - startTime).toDouble / 1000000
+    logDebug(s"Compiled Java code (${code.size} bytes) in $timeMs ms")
+    clazz
+  }
+
+  protected def freshName(): String = {
+    s"$prefix${curId.getAndIncrement}"
+  }
+
+
+  /**
+   * Convert the tree starting at the provided root node into a code generated
+   * series of if/else statements. If the tree is too large to fit in a single
+   * in-line method, it is broken up into multiple methods.
+   * Returns a string for the current function body and a string of any additional
+   * functions.
+   */
+  def nodeToTree(root: Node, depth: Int): (String, String) = {
+    // Handle the different types of nodes
+    root match {
+      case node: InternalNode => {
+        // Handle trees that get too large to fit in a single in-line java method
+        depth match {
+          case 8 => {
+            val newFunctionName = freshName()
+            val newFunction = nodeToFunction(root, newFunctionName)
+            (s"return ${newFunctionName}(input);", newFunction)
+          }
+          case _ => {
+            val nodeSplit = node.split
+            val (leftSubCode, leftSubFunction) = nodeToTree(node.leftChild, depth + 1)
+            val (rightSubCode, rightSubFunction) = nodeToTree(node.rightChild, depth + 1)
+            val subCode = nodeSplit match {
+              case split: CategoricalSplit => {
+                val isLeft = split.isLeft
+                isLeft match {
+                  case true => s"""
+                    if (categories.contains(input.apply(${split.featureIndex}))) {
--- End diff --
Sounds like a good idea, I'll take a look at in-lining this for small sets.
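The depth-limited generation discussed in this diff can be sketched in plain Scala without Spark or Janino. This is a simplified stand-in, not the PR's implementation: `TreeCodeGenSketch`, `Leaf`, `Internal`, and the `maxDepth` parameter are hypothetical, only threshold splits on a `double[]` input are modelled (no categorical splits), and the result is Java source text that is never compiled here.

```scala
object TreeCodeGenSketch {
  sealed trait Node
  final case class Leaf(prediction: Double) extends Node
  final case class Internal(featureIndex: Int, threshold: Double, left: Node, right: Node)
    extends Node

  private val counter = new java.util.concurrent.atomic.AtomicInteger()
  private def freshName(): String = s"sketchFn${counter.getAndIncrement()}"

  /**
   * Returns (body of the current method, source of any extra helper methods).
   * Once `depth` reaches `maxDepth`, the subtree is moved into its own helper
   * method so that no single generated method grows too deep.
   */
  def nodeToTree(root: Node, depth: Int, maxDepth: Int): (String, String) = {
    require(maxDepth > 0, "maxDepth must be positive")
    root match {
      case Leaf(p) =>
        (s"return $p;", "")
      case n: Internal if depth >= maxDepth =>
        val name = freshName()
        // Restart the depth count inside the new helper method.
        val (body, helpers) = nodeToTree(n, 0, maxDepth)
        val fn = s"double $name(double[] input) { $body }\n"
        (s"return $name(input);", fn + helpers)
      case Internal(i, t, l, r) =>
        val (leftCode, leftFns) = nodeToTree(l, depth + 1, maxDepth)
        val (rightCode, rightFns) = nodeToTree(r, depth + 1, maxDepth)
        (s"if (input[$i] <= $t) { $leftCode } else { $rightCode }", leftFns + rightFns)
    }
  }
}
```

Calling `TreeCodeGenSketch.nodeToTree(tree, 0, 8)` mirrors the fixed depth of 8 used in the diff. A smaller `maxDepth` produces more, shorter helper methods.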
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193562700 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52616/ Test FAILed.
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
Github user mwws closed the pull request at: https://github.com/apache/spark/pull/11571
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193562358 **[Test build #52616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52616/consoleFull)** for PR 11555 at commit [`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13738][SQL] Cleanup Data Source resolut...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11572#issuecomment-193562059 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52624/ Test FAILed.
[GitHub] spark pull request: [SPARK-13738][SQL] Cleanup Data Source resolut...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11572#issuecomment-193562058 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
Github user mwws commented on the pull request: https://github.com/apache/spark/pull/11571#issuecomment-193562023 OK, thanks for the explanation, I will close this PR.
[GitHub] spark pull request: [SPARK-13738][SQL] Cleanup Data Source resolut...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11572#issuecomment-193562054 **[Test build #52624 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52624/consoleFull)** for PR 11572 at commit [`36969f8`](https://github.com/apache/spark/commit/36969f8671ed396e9ed2027b8ee2c7435bbf7dfc). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13738][SQL] Cleanup Data Source resolut...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11572#issuecomment-193561759 **[Test build #52624 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52624/consoleFull)** for PR 11572 at commit [`36969f8`](https://github.com/apache/spark/commit/36969f8671ed396e9ed2027b8ee2c7435bbf7dfc).
[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/11336#issuecomment-193561063 A column can be applied to different DataFrames. For example, if both df1 and df2 have a column named "col", then after `col <- column("col")`, both `collect(select(df1, col))` and `collect(select(df2, col))` work. Taking the join case above as an example, you can have different DataFrames resulting from different joins on df1 and df2, and applying c3 to those different resulting DataFrames also works. So how do you know which DataFrame to associate with a column in such cases? @rxin, any comments on this issue?
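The point about a column not belonging to any one DataFrame can be sketched in a few lines. This is a plain-Python toy, not SparkR or Spark code: `Column` and `Frame` are hypothetical stand-ins, and the only assumption is that a column reference carries a name and is resolved at `select` time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical column reference: it holds a name only, no parent frame."""
    name: str


class Frame:
    """Toy stand-in for a DataFrame: a dict of column name -> list of values."""

    def __init__(self, data):
        self.data = data

    def select(self, col):
        # Resolution happens here, against whichever frame is being queried.
        return list(self.data[col.name])


df1 = Frame({"col": [1, 2, 3]})
df2 = Frame({"col": ["a", "b"]})
col = Column("col")

# The same reference resolves against both frames.
print(df1.select(col))
print(df2.select(col))
```

Because nothing in `Column` points back to a frame, there is no single answer to "which DataFrame does this column belong to", which is the question raised in the comment.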
[GitHub] spark pull request: [SPARK-13695] Don't cache MEMORY_AND_DISK bloc...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11533#issuecomment-193560625 **[Test build #2615 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2615/consoleFull)** for PR 11533 at commit [`8f332a7`](https://github.com/apache/spark/commit/8f332a7c14aff8aebfd8b36ec56fa33b8330605e).
[GitHub] spark pull request: [SPARK-13664][SQL] Cleanup Data Source resolut...
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/11572 [SPARK-13664][SQL] Cleanup Data Source resolution Follow-up to #11509, that simply refactors the interface that we use when resolving a pluggable `DataSource`. - Multiple functions share the same set of arguments so we make this a case class `DataSource`. Actual resolution is now done by calling a function. - Instead of having multiple methods named `apply` (some of which do writing some of which do reading) we now explicitly have `resolveRelation(...)` and `write(...)`. - Get rid of `Array[String]` since this is an internal API and was forcing us to call `toArray` in a bunch of places. You can merge this pull request into a Git repository by running: $ git pull https://github.com/marmbrus/spark dataSourceResolution Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11572.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11572 commit 36969f8671ed396e9ed2027b8ee2c7435bbf7dfc Author: Michael Armbrust Date: 2016-03-08T02:18:44Z [SPARK-13664][SQL] Cleanup Data Source resolution
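The shape of the refactoring described above (shared arguments bundled into one value, with explicitly named read and write entry points instead of several overloaded `apply` methods) might look roughly like the following. This is a plain-Python illustration only: the field names and return values are assumptions, and it is not the Scala code from the PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DataSource:
    # Arguments that several functions would otherwise each take separately.
    # These particular fields are hypothetical.
    class_name: str
    paths: Tuple[str, ...] = ()
    options: Dict[str, str] = field(default_factory=dict)

    def resolve_relation(self) -> str:
        """Read path: describe the relation this source would resolve to."""
        return f"relation({self.class_name}, paths={self.paths})"

    def write(self, mode: str, rows: List[dict]) -> str:
        """Write path: a separate, explicitly named method."""
        return f"wrote {len(rows)} rows via {self.class_name} in mode {mode}"


source = DataSource("parquet", paths=("/tmp/a", "/tmp/b"))
print(source.resolve_relation())
print(source.write("overwrite", [{"id": 1}]))
```

With distinct method names, a call site states whether it reads or writes, rather than relying on which overload the arguments happen to select.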
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11571#issuecomment-193560334 But this can only be added to 2.0 (we won't be able to change an existing release). If users already need to change the constructor in order to use it, why don't they just create a SQLContext/SparkSession?
[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/11301#discussion_r55307944 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala --- @@ -50,7 +50,7 @@ object ExpressionSet { class ExpressionSet protected( protected val baseSet: mutable.Set[Expression] = new mutable.HashSet, protected val originals: mutable.Buffer[Expression] = new ArrayBuffer) - extends Set[Expression] { + extends Set[Expression] with Serializable { --- End diff -- Yes, I got a non-serializable exception in the `hive` test suites when `ExpressionSet` is not `Serializable`. This is why I added `Serializable` to `ExpressionSet`.
[GitHub] spark pull request: [SPARK-13527] [SQL] Prune Filters based on Con...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11406#discussion_r55307963 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -769,6 +770,28 @@ object CombineFilters extends Rule[LogicalPlan] { } /** + * Remove all the deterministic conditions in a [[Filter]] that are guaranteed to be true + * given the constraints on the child's output. + */ +object PruneFilters extends Rule[LogicalPlan] with PredicateHelper { --- End diff -- Sure, will do it. Thanks!
[GitHub] spark pull request: [SPARK-13527] [SQL] Prune Filters based on Con...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11406#discussion_r55307448 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -769,6 +770,28 @@ object CombineFilters extends Rule[LogicalPlan] { } /** + * Remove all the deterministic conditions in a [[Filter]] that are guaranteed to be true + * given the constraints on the child's output. + */ +object PruneFilters extends Rule[LogicalPlan] with PredicateHelper { --- End diff -- Looks like `SimplifyFilters` is similar in purpose to this rule. Can we merge them?
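The doc comment quoted in the diff says the rule removes deterministic conditions that the child's constraints already guarantee. A minimal plain-Python sketch of that idea follows. It is not Catalyst code: `Predicate`, `prune_filter`, and the string-matching notion of "guaranteed" are simplifying assumptions made for illustration.

```python
from typing import FrozenSet, List, NamedTuple


class Predicate(NamedTuple):
    text: str                   # canonical form of the condition, e.g. "a > 1"
    deterministic: bool = True  # non-deterministic conditions are never pruned


def prune_filter(conditions: List[Predicate],
                 child_constraints: FrozenSet[str]) -> List[Predicate]:
    """Drop conjuncts that are deterministic and already guaranteed by the child."""
    return [
        p for p in conditions
        if not (p.deterministic and p.text in child_constraints)
    ]


constraints = frozenset({"a > 1", "b IS NOT NULL"})
conds = [
    Predicate("a > 1"),                              # guaranteed by the child: removable
    Predicate("c < 5"),                              # not guaranteed: kept
    Predicate("rand() > 0.5", deterministic=False),  # non-deterministic: kept
]
remaining = prune_filter(conds, constraints)
print([p.text for p in remaining])
```

If every conjunct is pruned, the filter has nothing left to check and could be dropped entirely; the sketch just returns an empty list in that case.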
[GitHub] spark pull request: [HOT-FIX][BUILD] Use the new location of `chec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11567#issuecomment-19389 Merged build finished. Test FAILed.
[GitHub] spark pull request: [HOT-FIX][BUILD] Use the new location of `chec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11567#issuecomment-19395 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52614/ Test FAILed.
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
Github user mwws commented on the pull request: https://github.com/apache/spark/pull/11571#issuecomment-19338 @rxin HiveContext is heavily used by many users now, and many of them are still tied to an old Spark version. As this change would be trivial but not constructive, I think there is no conflict with the context combination work.
[GitHub] spark pull request: [HOT-FIX][BUILD] Use the new location of `chec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11567#issuecomment-193555142 **[Test build #52614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52614/consoleFull)** for PR 11567 at commit [`380ceb3`](https://github.com/apache/spark/commit/380ceb30823ea2fbd76a33538381a64fe3d5171a). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/11571#issuecomment-193555075 Thanks for the pull request. So we are actually going to deprecate HiveContext because it has been one of the most confusing contexts in Spark. See more in https://issues.apache.org/jira/browse/SPARK-13485
[GitHub] spark pull request: [SPARK-13713][SQL] Migrate parser from ANTLR3 ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11557#issuecomment-193554969 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52613/ Test FAILed.
[GitHub] spark pull request: [SPARK-13713][SQL] Migrate parser from ANTLR3 ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11557#issuecomment-193554967 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13713][SQL] Migrate parser from ANTLR3 ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11557#issuecomment-193554765 **[Test build #52613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52613/consoleFull)** for PR 11557 at commit [`723edfb`](https://github.com/apache/spark/commit/723edfba11c40e832916d90b5d1453c926317022). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class ParseException(`
[GitHub] spark pull request: [SPARK-13689] [SQL] Move helper things in Cata...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11529
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11571#issuecomment-193553927 **[Test build #52623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52623/consoleFull)** for PR 11571 at commit [`a64a0a4`](https://github.com/apache/spark/commit/a64a0a4bb9dad43b837678f06f45e7a15215826f).
[GitHub] spark pull request: [SPARK-13689] [SQL] Move helper things in Cata...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/11529#issuecomment-193553008 Merging into master, thanks.
[GitHub] spark pull request: [SPARK-13737][SQL][wip]Add getOrCreate method ...
GitHub user mwws opened a pull request: https://github.com/apache/spark/pull/11571 [SPARK-13737][SQL][wip]Add getOrCreate method for HiveContext There is a "getOrCreate" method in SQLContext, which is useful for recoverable streaming applications with SQL operations. https://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations But the corresponding method is missing in HiveContext. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mwws/spark SPARK-HiveGetOrCreate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11571.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11571 commit a64a0a4bb9dad43b837678f06f45e7a15215826f Author: mwws Date: 2016-03-08T01:48:38Z Add getOrCreate method for HiveContext
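For readers unfamiliar with the pattern, a "get or create" accessor returns an existing shared instance when there is one and builds it only on first use, which is what lets restarted code pick up a context lazily. A minimal plain-Python sketch, assuming a hypothetical `Context` class (this is not the SQLContext or HiveContext implementation):

```python
import threading
from typing import Optional


class Context:
    """Hypothetical stand-in for a SQLContext/HiveContext-like object."""

    _instance: Optional["Context"] = None
    _lock = threading.Lock()

    def __init__(self, app_name: str):
        self.app_name = app_name

    @classmethod
    def get_or_create(cls, app_name: str) -> "Context":
        # Reuse the existing instance if there is one; otherwise build it once.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(app_name)
            return cls._instance


first = Context.get_or_create("streaming-job")
second = Context.get_or_create("ignored-name")
print(first is second, second.app_name)
```

The second call returns the instance created by the first, so its argument has no effect; a real API would need to decide whether that should be silent or an error.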
[GitHub] spark pull request: [SPARK-529] [sql] Modify SQLConf to use new co...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11570#issuecomment-193552569 **[Test build #52622 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52622/consoleFull)** for PR 11570 at commit [`884926c`](https://github.com/apache/spark/commit/884926c76e0403eca0aba43319eb28c37eca2e66).
[GitHub] spark pull request: [SPARK-529] [sql] Modify SQLConf to use new co...
GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/11570 [SPARK-529] [sql] Modify SQLConf to use new config API from core. Because SQL keeps track of all known configs, some customization was needed in SQLConf to allow that, since the core API does not have that feature. Tested via existing (and slightly updated) unit tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vanzin/spark SPARK-529-sql Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11570.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11570 commit 884926c76e0403eca0aba43319eb28c37eca2e66 Author: Marcelo Vanzin Date: 2015-12-07T19:54:00Z [SPARK-529] [sql] Modify SQLConf to use new config API from core. Because SQL keeps track of all known configs, some customization was needed in SQLConf to allow that, since the core API does not have that feature.
[GitHub] spark pull request: [SPARK-13692][CORE][SQL] Fix trivial Coverity/...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11530#issuecomment-193552260 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13692][CORE][SQL] Fix trivial Coverity/...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11530#issuecomment-193552263 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52611/ Test FAILed.
[GitHub] spark pull request: [SPARK-13692][CORE][SQL] Fix trivial Coverity/...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11530#issuecomment-193552052 **[Test build #52611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52611/consoleFull)** for PR 11530 at commit [`9a0f8fa`](https://github.com/apache/spark/commit/9a0f8fabeccf56800dd8af74c39f14a99b8041a7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11566#issuecomment-193551318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52609/ Test PASSed.
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11566#issuecomment-193551316 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11566#issuecomment-193551056 **[Test build #52609 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52609/consoleFull)** for PR 11566 at commit [`398859c`](https://github.com/apache/spark/commit/398859cf12df28a38b1fbf0d740eb14a1af20e63). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193550601 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52618/ Test FAILed.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193550529 **[Test build #52618 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52618/consoleFull)** for PR 11569 at commit [`d19992b`](https://github.com/apache/spark/commit/d19992b4ec5141221cbf8724dc592b09e541039b). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193550598 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193550315 **[Test build #52621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52621/consoleFull)** for PR 11550 at commit [`c51d4ef`](https://github.com/apache/spark/commit/c51d4efbe72e4713a53ee7706996bef837d79fa5).
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11565#issuecomment-193550327 **[Test build #52620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52620/consoleFull)** for PR 11565 at commit [`467b095`](https://github.com/apache/spark/commit/467b095d89ce641f568aade09d710fb9ea573273).
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193549873 test this please
[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-193549215 **[Test build #52619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52619/consoleFull)** for PR 11550 at commit [`db27259`](https://github.com/apache/spark/commit/db27259629721f2e584457b4e5739baabfd851ea).
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/11565#discussion_r55304511

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -345,8 +343,6 @@ object ColumnPruning extends Rule[LogicalPlan] {
     // Prunes the unused columns from child of Aggregate/Window/Expand/Generate
     case a @ Aggregate(_, _, child) if (child.outputSet -- a.references).nonEmpty =>
       a.copy(child = prunedChild(child, a.references))
-    case w @ Window(_, _, _, _, child) if (child.outputSet -- w.references).nonEmpty =>
--- End diff --

First, `w.outputSet` always includes `child.outputSet`. Second, `w.references` only includes the expressions present in the current `Window` operator. This set does not include attributes that are implicitly referenced by being passed through to the output tuple. Thus, the rule is no longer valid: if we keep it, it will wrongly prune the child. Please correct me if my understanding is wrong.
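The argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Catalyst code: `ToyWindow`, `ToyPruning`, `wronglyPrunedChild` and `requiredFromChild` are hypothetical names, attributes are modeled as plain strings, and the window is assumed to pass every child column through to its output, as the comment describes.

```scala
// Toy model (not Catalyst): attributes are plain strings.
final case class ToyWindow(
    windowOutputs: Set[String], // columns produced by the window functions
    references: Set[String],    // columns mentioned in the window expressions
    childOutput: Set[String]) { // columns produced by the child plan
  // The window passes every child column through and appends its own outputs.
  def outputSet: Set[String] = childOutput ++ windowOutputs
}

object ToyPruning {
  // Roughly what pruning by `w.references` alone would keep from the child.
  def wronglyPrunedChild(w: ToyWindow): Set[String] =
    w.childOutput.intersect(w.references)

  // A safe set also keeps pass-through columns that the parent operator needs.
  def requiredFromChild(w: ToyWindow, parentNeeds: Set[String]): Set[String] =
    w.childOutput.intersect(w.references ++ parentNeeds)
}
```

For a query shaped like `SELECT name, rank() OVER (PARTITION BY dept ORDER BY salary)`, the column `name` never appears in the window's references, so the first function drops it even though the parent still needs it; the second keeps it.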
[GitHub] spark pull request: [SPARK-13404] [SQL] Create variables for input...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11274#issuecomment-193548355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52612/ Test PASSed.
[GitHub] spark pull request: [SPARK-13404] [SQL] Create variables for input...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11274#issuecomment-193548354 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13404] [SQL] Create variables for input...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11274#issuecomment-193547869 **[Test build #52612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52612/consoleFull)** for PR 11274 at commit [`f431170`](https://github.com/apache/spark/commit/f4311709dd0c66add99aeb248acdc70863fba239).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193545016 **[Test build #52618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52618/consoleFull)** for PR 11569 at commit [`d19992b`](https://github.com/apache/spark/commit/d19992b4ec5141221cbf8724dc592b09e541039b).
[GitHub] spark pull request: [SPARK-13577] [yarn] Allow Spark jar to be mul...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11500#issuecomment-193543523 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52607/ Test PASSed.
[GitHub] spark pull request: [SPARK-13577] [yarn] Allow Spark jar to be mul...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11500#issuecomment-193543521 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13577] [yarn] Allow Spark jar to be mul...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11500#issuecomment-193542877 **[Test build #52607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52607/consoleFull)** for PR 11500 at commit [`9bab2ea`](https://github.com/apache/spark/commit/9bab2ea1fc5bbb91497e1994b3613cd6cdc4b3be).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12719][SQL] SQL generation support for ...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/11563#issuecomment-193542558

@rxin Hi Reynold, we have two cases to handle.

```SQL
SELECT explode(array(1,2,3)) FROM src

SELECT gentab2.*
FROM t1
LATERAL VIEW explode(array(array(1,2,3))) gentab1 AS gencol1
LATERAL VIEW explode(gentab1.gencol1) gentab2 AS gencol2
```

Currently, I handle the first case in `projToSql` and the second case in `generateToSql`, because we wanted to generate SQL that is close to the original SQL. A lateral view can also refer to columns from tables that precede it, so I felt it was safer to generate SQL very close to the source SQL to reduce risk.

I also thought about treating the first case as a special case of LATERAL VIEW. In that case we would have to generate a table alias, which is missing in case 1, and fix up the projection list above to refer to it. However, I went with the approach in this PR because it didn't seem too complex and it retains the layout of the original SQL. I could easily be overlooking something here and would appreciate your guidance. Please let me know.
[GitHub] spark pull request: [SPARK-11108][ML] OneHotEncoder should support...
Github user frreiss commented on the pull request: https://github.com/apache/spark/pull/9777#issuecomment-193542138 LGTM aside from that one typo
[GitHub] spark pull request: [SPARK-11108][ML] OneHotEncoder should support...
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/9777#discussion_r55302705

--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala ---
@@ -132,8 +133,10 @@ class OneHotEncoder(override val uid: String) extends Transformer
     val numAttrs = dataset.select(col(inputColName).cast(DoubleType)).map(_.getDouble(0))
       .aggregate(0.0)(
         (m, x) => {
-          assert(x >=0.0 && x == x.toInt,
-            s"Values from column $inputColName must be indices, but got $x.")
+          assert(x <= Int.MaxValue,
+            s"OneHotEncoder only supports up to ${Int.MaxValue} indices, but got $x")
+          assert(x >= 0.0 && x == x.toInt,
+            s"Values e column $inputColName must be indices, but got $x.")
--- End diff --

values *in column
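The two checks in the diff above can be tried in isolation. The following is a minimal sketch, not the `OneHotEncoder` code: `IndexCheck` and `toIndex` are hypothetical names, and `require` is swapped in for the `assert` calls in the diff so that a violation surfaces as an `IllegalArgumentException`.

```scala
object IndexCheck {
  // Hypothetical helper: validates that a Double holds a non-negative
  // integer index that fits in an Int, then returns it as an Int.
  def toIndex(x: Double, colName: String): Int = {
    // Range check first, so oversized values get a specific message.
    require(x <= Int.MaxValue,
      s"Only up to ${Int.MaxValue} indices are supported, but got $x")
    // Then reject negative and fractional values.
    require(x >= 0.0 && x == x.toInt,
      s"Values in column $colName must be indices, but got $x.")
    x.toInt
  }
}
```

The ordering matters mostly for the error message: a `Double` larger than `Int.MaxValue` saturates to `Int.MaxValue` under `toInt`, so the second check alone would also reject it, but with a less specific explanation.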
[GitHub] spark pull request: [SPARK-11535][ML] handling empty string in Str...
Github user thunterdb commented on the pull request: https://github.com/apache/spark/pull/9522#issuecomment-193542005 @pravingadakh sorry for the delay. Would you mind resolving the conflicts?
[GitHub] spark pull request: [SPARK-13249][SQL] Add Filter checking nullabi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11235#issuecomment-193541806 **[Test build #52617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52617/consoleFull)** for PR 11235 at commit [`312cb32`](https://github.com/apache/spark/commit/312cb326922624e95528b7f2dc92129c59b3b524).
[GitHub] spark pull request: [SPARK-13549] [SQL] Refactor the Optimizer Rul...
Github user frreiss commented on the pull request: https://github.com/apache/spark/pull/11427#issuecomment-193540108 LGTM
[GitHub] spark pull request: [SPARK-13117] [Web UI] WebUI should use the lo...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11490#issuecomment-193539323 LGTM
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193538403 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193538292 **[Test build #52615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52615/consoleFull)** for PR 11569 at commit [`0ad424b`](https://github.com/apache/spark/commit/0ad424bbcd03bf4c57566dbe92e537db213ba187).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13734][SPARKR] Added histogram function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11569#issuecomment-193538409 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52615/ Test FAILed.
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/11566#issuecomment-193537547 LGTM
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11565#discussion_r55301619

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -345,8 +343,6 @@ object ColumnPruning extends Rule[LogicalPlan] {
     // Prunes the unused columns from child of Aggregate/Window/Expand/Generate
     case a @ Aggregate(_, _, child) if (child.outputSet -- a.references).nonEmpty =>
       a.copy(child = prunedChild(child, a.references))
-    case w @ Window(_, _, _, _, child) if (child.outputSet -- w.references).nonEmpty =>
--- End diff --

Isn't it still a valid optimization?
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/11566#discussion_r55301512

--- Diff: core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala ---
@@ -125,16 +125,14 @@ private[spark] class AppClient(
       registerMasterFutures.set(tryRegisterAllMasters())
       registrationRetryTimer.set(registrationRetryThread.schedule(new Runnable {
         override def run(): Unit = {
-          Utils.tryOrExit {
-            if (registered.get) {
-              registerMasterFutures.get.foreach(_.cancel(true))
-              registerMasterThreadPool.shutdownNow()
-            } else if (nthRetry >= REGISTRATION_RETRIES) {
-              markDead("All masters are unresponsive! Giving up.")
-            } else {
-              registerMasterFutures.get.foreach(_.cancel(true))
-              registerWithMaster(nthRetry + 1)
-            }
+          if (registered.get) {
+            registerMasterFutures.get.foreach(_.cancel(true))
+            registerMasterThreadPool.shutdownNow()
+          } else if (nthRetry >= REGISTRATION_RETRIES) {
+            markDead("All masters are unresponsive! Giving up.")
--- End diff --

FYI, this line will call `sc.stop()`: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala#L136
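The three-way branch in the diff above can be read as a small decision function. This is a sketch only, not the `AppClient` code: `RetryModel`, `Action`, `StopRetrying`, `GiveUp`, `RetryAgain` and `decide` are hypothetical names, and it assumes the final `else` branch, which the quoted hunk cuts off, still retries with `nthRetry + 1` as in the removed lines.

```scala
object RetryModel {
  sealed trait Action
  case object StopRetrying extends Action                  // already registered
  case object GiveUp extends Action                        // retries exhausted
  final case class RetryAgain(nextAttempt: Int) extends Action

  // Mirrors the if / else-if / else chain in the scheduled Runnable.
  def decide(registered: Boolean, nthRetry: Int, maxRetries: Int): Action =
    if (registered) StopRetrying
    else if (nthRetry >= maxRetries) GiveUp
    else RetryAgain(nthRetry + 1)
}
```

In this model `GiveUp` corresponds to the `markDead(...)` call that the comment points at.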
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193534923 **[Test build #52616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52616/consoleFull)** for PR 11555 at commit [`656a13a`](https://github.com/apache/spark/commit/656a13a84be56de2a6806296492951016082092e).
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/11566#issuecomment-193534718

> So what happens now if the scheduled Runnable throws an exception?

Just go to `Thread.getDefaultUncaughtExceptionHandler()`.
[GitHub] spark pull request: [SPARK-13675][UI] Fix wrong historyserver url ...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/11518#issuecomment-193534057 Yes, I also tested with multiple attempts.
[GitHub] spark pull request: [SPARK-13404] [SQL] Create variables for input...
Github user nongli commented on the pull request: https://github.com/apache/spark/pull/11274#issuecomment-193534012 sounds good.
[GitHub] spark pull request: [SPARK-12718][SPARK-13720][SQL] SQL generation...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11555#issuecomment-193533951 retest this please
[GitHub] spark pull request: [SPARK-13732] [SQL] Remove projectList from Wi...
Github user frreiss commented on the pull request: https://github.com/apache/spark/pull/11565#issuecomment-193533582 LGTM
[GitHub] spark pull request: [SPARK-13711][Core]Don't call SparkUncaughtExc...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/11566#issuecomment-193533661 So what happens now if the scheduled Runnable throws an exception?
[GitHub] spark pull request: [SPARK-11011][SQL] Narrow type of UDT serializ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11379#issuecomment-193531378 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52600/ Test PASSed.
[GitHub] spark pull request: [SPARK-11011][SQL] Narrow type of UDT serializ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11379#issuecomment-193531376 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12469][CORE][WIP/RFC] Consistent accumu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11105#issuecomment-193531213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52601/ Test PASSed.
[GitHub] spark pull request: [SPARK-12469][CORE][WIP/RFC] Consistent accumu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11105#issuecomment-193531212 Merged build finished. Test PASSed.