[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-159991288 **[Test build #46778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46778/consoleFull)** for PR 9979 at commit [`1f4605e`](https://github.com/apache/spark/commit/1f4605e7c58623b9ab1559d666eff6041858dbc3). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-159991328 https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46778/consoleFull is for hadoop 2.0 test.
[GitHub] spark pull request: [SPARK-11973][SQL] Improve optimizer code read...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9995#issuecomment-159995211 **[Test build #46775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46775/consoleFull)** for PR 9995 at commit [`95b2e0d`](https://github.com/apache/spark/commit/95b2e0d866ac7608f638795fd1542686204708d6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11973][SQL] Improve optimizer code read...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9995#issuecomment-159995282 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46775/ Test PASSed.
[GitHub] spark pull request: [SPARK-11973][SQL] Improve optimizer code read...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9995#issuecomment-159995281 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159996173 **[Test build #46779 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46779/consoleFull)** for PR 9571 at commit [`f6bf558`](https://github.com/apache/spark/commit/f6bf5587f6c76753a1e161bbae0050a49bf6d872).
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159997153 **[Test build #46780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46780/consoleFull)** for PR 9571 at commit [`1dcbb5f`](https://github.com/apache/spark/commit/1dcbb5fce89a5f5f9a19b846432ac4d9937a23d0).
[GitHub] spark pull request: PR10000?
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1#issuecomment-159997959 Reynold is just claiming all the good numbers for himself.
[GitHub] spark pull request: [SPARK-11206] (Followup) Fix SQLListenerMemory...
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/9991#discussion_r46008320

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLListenerSuite.scala ---

@@ -343,6 +343,8 @@ class SQLListenerMemoryLeakSuite extends SparkFunSuite {
       .set("spark.sql.ui.retainedExecutions", "50") // Set it to 50 to run this test quickly
     val sc = new SparkContext(conf)
     try {
+      // Clear the sql listener created by a previous test suite.
+      SQLContext.clearSqlListener()

--- End diff --

.. I can imagine Zeppelin wanting to purge these, or whatever Spark Kernel is named as.
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-16056 **[Test build #46778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46778/consoleFull)** for PR 9979 at commit [`1f4605e`](https://github.com/apache/spark/commit/1f4605e7c58623b9ab1559d666eff6041858dbc3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5337][Mesos][Standalone] respect spark....
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8610#issuecomment-160001167 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46774/ Test FAILed.
[GitHub] spark pull request: [SPARK-5337][Mesos][Standalone] respect spark....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8610#issuecomment-160001142 **[Test build #46774 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46774/consoleFull)** for PR 8610 at commit [`8232a80`](https://github.com/apache/spark/commit/8232a808e398f3644304822d1824aa0b923090dc). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5337][Mesos][Standalone] respect spark....
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8610#issuecomment-160001166 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.2] When down...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/9979#discussion_r46005696

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala ---

@@ -34,23 +34,54 @@
 import org.apache.spark.sql.hive.HiveContext
 import org.apache.spark.util.{MutableURLClassLoader, Utils}

 /** Factory for `IsolatedClientLoader` with specific versions of hive. */
-private[hive] object IsolatedClientLoader {
+private[hive] object IsolatedClientLoader extends Logging {
   /**
    * Creates isolated Hive client loaders by downloading the requested version from maven.
    */
   def forVersion(
-      version: String,
+      hiveMetastoreVersion: String,
+      hadoopVersion: String,
       config: Map[String, String] = Map.empty,
       ivyPath: Option[String] = None,
       sharedPrefixes: Seq[String] = Seq.empty,
       barrierPrefixes: Seq[String] = Seq.empty): IsolatedClientLoader = synchronized {
-    val resolvedVersion = hiveVersion(version)
-    val files = resolvedVersions.getOrElseUpdate(resolvedVersion,
-      downloadVersion(resolvedVersion, ivyPath))
+    val resolvedVersion = hiveVersion(hiveMetastoreVersion)
+    // We will first try to share Hadoop classes. If we cannot resolve the Hadoop artifact
+    // with the given version, we will use Hadoop 2.4.0 and then will not share Hadoop classes.
+    var sharesHadoopClasses = true
+    val files = if (resolvedVersions.contains((resolvedVersion, hadoopVersion))) {
+      resolvedVersions((resolvedVersion, hadoopVersion))
+    } else {
+      val (downloadedFiles, actualHadoopVersion) =
+        try {
+          (downloadVersion(resolvedVersion, hadoopVersion, ivyPath), hadoopVersion)
+        } catch {
+          case e: RuntimeException if e.getMessage.contains("hadoop") =>
+            // If the error message contains hadoop, it is probably because the hadoop
+            // version cannot be resolved (e.g. it is a vendor specific version like
+            // 2.0.0-cdh4.1.1). If it is the case, we will try just
+            // "org.apache.hadoop:hadoop-client:2.4.0". "org.apache.hadoop:hadoop-client:2.4.0"
+            // is used just because we used to hard code it as the hadoop artifact to download.
+            logWarning(s"Failed to resolve Hadoop artifacts for the version ${hadoopVersion}. " +
+              s"We will change the hadoop version from ${hadoopVersion} to 2.4.0 and try again. " +
+              "Hadoop classes will not be shared between Spark and Hive metastore client. " +
+              "It is recommended to set jars used by Hive metastore client through " +
+              "spark.sql.hive.metastore.jars in the production environment.")
+            sharesHadoopClasses = false
+            (downloadVersion(resolvedVersion, "2.4.0", ivyPath), "2.4.0")
+          case throwable: Throwable =>
+            // If it is other causes, we just re-throw the Throwable.
+            throw throwable

--- End diff --

Done
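For readers skimming the thread, the retry-with-fallback shape of that change can be sketched in isolation. This is a minimal, self-contained sketch with hypothetical names (`resolveArtifacts`, `Resolution`), not Spark's actual internals: try the requested Hadoop version first; on a hadoop-related resolution failure, retry with a known-good default and record that Hadoop classes will not be shared.

```scala
// Hypothetical stand-ins for the diff above; not Spark's real classes.
object HadoopFallbackSketch {
  final case class Resolution(
      files: Seq[String],
      hadoopVersion: String,
      sharesHadoopClasses: Boolean)

  // Stand-in for downloadVersion(): vendor-specific versions are assumed
  // unresolvable here, mirroring the 2.0.0-cdh4.1.1 example in the comment.
  def resolveArtifacts(hadoopVersion: String): Seq[String] =
    if (hadoopVersion.contains("cdh"))
      throw new RuntimeException(s"hadoop artifact not found: $hadoopVersion")
    else Seq(s"hadoop-client-$hadoopVersion.jar")

  def forVersion(requested: String, fallback: String = "2.4.0"): Resolution =
    try {
      Resolution(resolveArtifacts(requested), requested, sharesHadoopClasses = true)
    } catch {
      case e: RuntimeException if e.getMessage.contains("hadoop") =>
        // Could not resolve the requested version: fall back to the default
        // artifact and stop sharing Hadoop classes, as the patch does.
        Resolution(resolveArtifacts(fallback), fallback, sharesHadoopClasses = false)
    }
}
```

The key design point is that the fallback is observable: callers learn both the Hadoop version actually used and whether class sharing was disabled, rather than silently getting different behaviour.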
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159996339 **[Test build #46779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46779/consoleFull)** for PR 9571 at commit [`f6bf558`](https://github.com/apache/spark/commit/f6bf5587f6c76753a1e161bbae0050a49bf6d872). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11700] [SQL] use weak reference in Spar...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/9990#discussion_r46007594

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---

@@ -401,7 +402,7 @@ class SQLContext private[sql](
    */
   @Experimental
   def createDataFrame[A <: Product : TypeTag](rdd: RDD[A]): DataFrame = {
-    SparkPlan.currentContext.set(self)
+    SparkPlan.currentContext.set(new WeakReference[SQLContext](self))

--- End diff --

A `WeakReference` is quite weak; you'll lose the reference on a full GC. Is this not going to cause the code to fail suddenly? I don't know that references are any remedy for a leak.
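The GC behaviour srowen is pointing at can be demonstrated without any Spark types. A minimal plain-Scala sketch; the holder below is only a loose stand-in for `SparkPlan.currentContext`, not the real thing:

```scala
import java.lang.ref.WeakReference
import java.util.concurrent.atomic.AtomicReference

object WeakRefSketch {
  final class Context(val name: String)

  // Loose stand-in for a holder like SparkPlan.currentContext.
  val current = new AtomicReference[WeakReference[Context]]()

  def visibleWhileStronglyHeld(): Boolean = {
    var ctx: Context = new Context("sql")
    current.set(new WeakReference(ctx))
    // While a strong reference (ctx) is alive, get() returns the context.
    val visible = current.get().get() != null

    ctx = null   // drop the only strong reference
    System.gc()  // after a full GC, current.get().get() may legitimately be null
    // That sudden null is exactly the failure mode raised in the review:
    // a WeakReference is cleared as soon as its referent is only weakly reachable.
    visible
  }
}
```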
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159998497 **[Test build #46782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46782/consoleFull)** for PR 9571 at commit [`25e77bd`](https://github.com/apache/spark/commit/25e77bd8eb1d8908410802dc2c59274b15f7294b).
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/10009

[SPARK-12018][SQL] Refactor common subexpression elimination code

JIRA: https://issues.apache.org/jira/browse/SPARK-12018

The code of common subexpression elimination can be factored and simplified. Some unnecessary variables can be removed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 refactor-subexpr-eliminate

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10009.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10009

commit 4876ce082b8d9a387b041f1a7e8d9060ecb4a777
Author: Liang-Chi Hsieh
Date: 2015-11-26T22:39:14Z

    Refactor common subexpression elimination.
[GitHub] spark pull request: doc typo: "classificaion" -> "classification"
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10008#issuecomment-16899 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159996648 jenkins, test this please
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159996634 Added reworked design:

* all metrics go through the `MetricsSystem`; the providers return an optional `Source` from the `start()` call.
* `FsHistoryProvider` metrics split lookups of missing files from failed attempts to replay the logs.
* `FsHistoryProvider` metrics include the time of the mergeListing() operation, which can be quite the CPU killer.
* There's a `HealthSource` for health checks; it's being explicitly managed in the HistoryServer.
* And there's an initial health check for the FS, which simply returns the FS safe-mode flag.

Really, the health check logic needs its own `HealthSystem` for the register/unregister. Trying to design one that spans all the applications is more complex, and I'm trying to avoid that. Furthermore, those operations which fail asynchronously and just have their exceptions logged should have those exceptions saved to another health check (I may implement that). That way the health checks will react to live system failures, including things like the log directory being deleted during a run.
[GitHub] spark pull request: PR10000?
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1#issuecomment-159998570 :)
[GitHub] spark pull request: PR10000?
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1#issuecomment-159998606 Now I gotta think hard about what I should submit using this branch when I reopen the pull request.
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-15278 **[Test build #46777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46777/consoleFull)** for PR 10001 at commit [`af508de`](https://github.com/apache/spark/commit/af508deafa1f1ba3ff46fc98a0e2fbbc81c6a4c8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159996343 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46779/ Test FAILed.
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159996342 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-159997461 **[Test build #46781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46781/consoleFull)** for PR 9571 at commit [`1dcbb5f`](https://github.com/apache/spark/commit/1dcbb5fce89a5f5f9a19b846432ac4d9937a23d0).
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user steveloughran commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-16264 (Note that the POMs changed to pull in some more of the codahale servlets, though only the health checks & thread dump are being registered. Hooking up those servlets through the MetricsSystem may be a bit tricky; and as it has its own metrics, it is probably the one to go for. In which case, wrapping up JVM stats as a `Source` would give automatic access to those numbers.)
[GitHub] spark pull request: [SPARK-11638] [Mesos + Docker Bridge networkin...
Github user radekg commented on the pull request: https://github.com/apache/spark/pull/9608#issuecomment-16205 I've added the code for 1.6. It works (tasks are successfully finishing). However, I am not 100% sure what the impact of this change is. It would be great if somebody familiar with NettyRpcEnv could cross-check.
[GitHub] spark pull request: [SPARK-11638] [Mesos + Docker Bridge networkin...
Github user radekg commented on the pull request: https://github.com/apache/spark/pull/9608#issuecomment-16247 Regarding the `TorrentBroadcast`: I think some magic needs to be done around `blockManager.port`.
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-159986618 ok to test
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-159987537 **[Test build #46777 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46777/consoleFull)** for PR 10001 at commit [`af508de`](https://github.com/apache/spark/commit/af508deafa1f1ba3ff46fc98a0e2fbbc81c6a4c8).
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-159988308 @dilipbiswal Do you know why it is broken?
[GitHub] spark pull request: [SPARK-12010][SQL] Spark JDBC requires support...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10003#discussion_r46005296

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/ProgressCassandraDialect.scala ---

@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.jdbc
+
+import java.sql.Types
+
+import org.apache.spark.sql.types._
+
+
+private case object ProgressCassandraDialect extends JdbcDialect {

--- End diff --

Are there other Cassandra jdbc drivers and is this the primary one people use? I'm thinking we should just drop "Progress" here.
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-159998104 @yhuai Hi Yin, in the discoverPartitions method we build the partition spec and cast each partition value to the column type of the user-specified schema. For a null partition value, the following code raises a null pointer exception in row.getString(i):

Cast(Literal.create(row.getString(i), StringType), userProvidedSchema.fields(i).dataType).eval()

In this fix I check for null first and create a null literal of string type instead of calling row.getString(). Hope that is okay; please let me know.
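The guard described above is language-agnostic; a minimal sketch of the same logic in Python (hypothetical names, not the actual Spark API): cast a partition value to the column's target type, but short-circuit on null instead of reading the cell as a string.

```python
def cast_partition_value(cell, cast_fn):
    """Cast one partition value to the user-specified column type.

    `cell` models the raw partition value (None models a SQL null) and
    `cast_fn` models the cast to the column's target type. Reading a null
    cell as a string is what raised the NPE in the report, so we propagate
    the null instead of casting it.
    """
    if cell is None:
        return None  # a null partition value stays null
    return cast_fn(cell)


# A partition column declared as an integer type:
print(cast_partition_value("42", int))   # 42
print(cast_partition_value(None, int))   # None
```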
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-15342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46777/ Test PASSed.
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-15341 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-16105 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-16106 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46778/ Test PASSed.
[GitHub] spark pull request: doc typo: "classificaion" -> "classification"
GitHub user muxator opened a pull request: https://github.com/apache/spark/pull/10008

doc typo: "classificaion" -> "classification"

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/muxator/spark patch-1

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10008.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10008

commit ebc587f698edfb1963e5f6f64199beb8137cb2ff
Author: muxator
Date: 2015-11-26T22:41:04Z

    doc typo: "classificaion" -> "classification"
[GitHub] spark pull request: [SPARK-11990][SQL] Don't collapse projections ...
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/9993
[GitHub] spark pull request: [SPARK-11990][SQL] Don't collapse projections ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/9993#issuecomment-16806 @marmbrus Agreed, thanks. I hadn't noticed that common subexpression elimination was recently introduced in codegen. Closing this now.
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10009#discussion_r46009403

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala ---
@@ -95,12 +95,15 @@ abstract class Expression extends TreeNode[Expression] {
     ctx.subExprEliminationExprs.get(this).map { subExprState =>
       // This expression is repeated meaning the code to evaluated has already been added
       // as a function, `subExprState.fnName`. Just call that.
+      val isNull = ctx.freshName("isNull")
+      val primitive = ctx.freshName("primitive")
--- End diff --

Actually, the above variables are not necessary. I will remove them once the tests pass.
[GitHub] spark pull request: doc typo: "classificaion" -> "classification"
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10008#issuecomment-160001253 Trivial, but OK
[GitHub] spark pull request: PR10000?
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/1#issuecomment-160001389 Streaming DataFrames? I think this is a very significant feature!
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10009#issuecomment-160001731 **[Test build #46783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46783/consoleFull)** for PR 10009 at commit [`4876ce0`](https://github.com/apache/spark/commit/4876ce082b8d9a387b041f1a7e8d9060ecb4a777).
[GitHub] spark pull request: [SPARK-11401] [MLLIB] PMML export for Logistic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9397#issuecomment-160004312 **[Test build #46784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46784/consoleFull)** for PR 9397 at commit [`fd38551`](https://github.com/apache/spark/commit/fd385515ada6f83fcbd7bd70f5083cd7f7529a33).
[GitHub] spark pull request: [SPARK-5337][Mesos][Standalone] respect spark....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8610#issuecomment-160005256 **[Test build #46785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46785/consoleFull)** for PR 8610 at commit [`8232a80`](https://github.com/apache/spark/commit/8232a808e398f3644304822d1824aa0b923090dc).
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-160007597 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-160008526 OK. I am merging this to master and branch 1.6. I will watch the builds and see if there are any new issues.
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/9979#issuecomment-160008842 hmm... not sure why https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46778/consoleFull used the hadoop 2.3 profile.
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/10001#discussion_r46011037

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala ---
@@ -606,9 +606,17 @@ abstract class HadoopFsRelation private[sql](
     // we need to cast into the data type that user specified.
     def castPartitionValuesToUserSchema(row: InternalRow) = {
       InternalRow((0 until row.numFields).map { i =>
-        Cast(
-          Literal.create(row.getString(i), StringType),
-          userProvidedSchema.fields(i).dataType).eval()
+        row.isNullAt(i) match {
--- End diff --

I think it's better to check null in `InternalRow.getString()`
[GitHub] spark pull request: [SPARK-12020] [TESTS] [test-hadoop2.0] PR buil...
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/10010

[SPARK-12020] [TESTS] [test-hadoop2.0] PR builder cannot trigger hadoop 2.0 test

https://issues.apache.org/jira/browse/SPARK-12020

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark SPARK-12020

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10010.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10010

commit 38194ea3036a80e1352f8edebba183412361c403
Author: Yin Huai
Date: 2015-11-27T00:23:10Z

    fix
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-160011060 @dilipbiswal I guess I did not ask my question clearly. I meant why 1.5 is good but 1.6 is broken. There must be a change that exposed this issue.
[GitHub] spark pull request: [SPARK-11700] [SQL] Remove thread local SQLCon...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/9990#issuecomment-160011485 @zsxwing @srowen I changed this to use setActive()/getActiveContext().
[GitHub] spark pull request: [SPARK-7857][MLLIB] Prevent IDFModel from retu...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9843#discussion_r46011586

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/IDF.scala ---
@@ -211,14 +213,16 @@ private object IDFModel {
     val n = v.size
     v match {
       case SparseVector(size, indices, values) =>
-        val nnz = indices.size
-        val newValues = new Array[Double](nnz)
+        val newElements = new ArrayBuffer[(Int, Double)]
         var k = 0
-        while (k < nnz) {
-          newValues(k) = values(k) * idf(indices(k))
+        while (k < indices.size) {
+          val newValue = values(k) * idf(indices(k))
--- End diff --

Since `idf` can be sparse, lookup will not be a constant-time operation. You need to make it 4 cases. Also, `indices.size` is not good since the `size` method will be called in each iteration of a tight loop. The following code should work, but I just wrote it here without testing, so it may have some bugs. You need to add tests for the four different scenarios.

```scala
(idf, v) match {
  case (didf: DenseVector, dv: DenseVector) =>
    val didfValues = didf.values
    val dvValues = dv.values
    val newValues = new Array[Double](n)
    var j = 0
    while (j < n) {
      newValues(j) = dvValues(j) * didfValues(j)
      j += 1
    }
    Vectors.dense(newValues)
  case (didf: DenseVector, sv: SparseVector) =>
    val didfValues = didf.values
    val svIndices = sv.indices
    val svValues = sv.values
    val svNnz = svIndices.length
    val newValues = new Array[Double](svNnz)
    var k = 0
    while (k < svNnz) {
      newValues(k) = svValues(k) * didfValues(svIndices(k))
      k += 1
    }
    Vectors.sparse(n, svIndices, newValues)
  case (sidf: SparseVector, dv: DenseVector) =>
    val dvValues = dv.values
    val sidfIndices = sidf.indices
    val sidfValues = sidf.values
    val sidfNnz = sidfIndices.length
    val newValues = new Array[Double](sidfNnz)
    var k = 0
    while (k < sidfNnz) {
      newValues(k) = sidfValues(k) * dvValues(sidfIndices(k))
      k += 1
    }
    Vectors.sparse(n, sidfIndices, newValues)
  case (sidf: SparseVector, sv: SparseVector) =>
    val (largeIndices, largeValues, largeNnz, smallIndices, smallValues, smallNnz) = {
      val sidfIndices = sidf.indices
      val sidfValues = sidf.values
      val sidfNnz = sidfIndices.length
      val svIndices = sv.indices
      val svValues = sv.values
      val svNnz = svIndices.length
      if (sidfNnz > svNnz) {
        (sidfIndices, sidfValues, sidfNnz, svIndices, svValues, svNnz)
      } else {
        (svIndices, svValues, svNnz, sidfIndices, sidfValues, sidfNnz)
      }
    }
    val newIndices = new ArrayBuffer[Int](smallNnz)
    val newValues = new ArrayBuffer[Double](smallNnz)
    var j = 0
    var k = 0
    while (j < smallNnz && k < largeNnz) {
      val smallIndex = smallIndices(j)
      val largeIndex = largeIndices(k)
      if (smallIndex == largeIndex) {
        newIndices.append(smallIndex)
        newValues.append(smallValues(j) * largeValues(k))
        j += 1
        k += 1
      } else if (smallIndex > largeIndex) {
        k += 1
      } else {
        j += 1
      }
    }
    Vectors.sparse(n, newIndices.toArray, newValues.toArray)
  case (idfOther, vOther) =>
    throw new UnsupportedOperationException(
      s"Only sparse and dense vectors are supported but got ${idfOther.getClass} " +
      s"for idf vector, and ${vOther.getClass} for term frequency vector.")
}
```
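The sparse-sparse branch above intersects the two sorted index arrays with a two-pointer merge; only indices present in both vectors produce an output element, which is what removes explicit zeros. A minimal runnable model of just that merge, using plain Python lists instead of MLlib vectors (for illustration only, not the Spark API):

```python
def sparse_multiply(indices_a, values_a, indices_b, values_b):
    """Elementwise product of two sparse vectors given as sorted
    (indices, values) pairs. Two pointers walk both index arrays;
    only indices present in both vectors survive, mirroring the
    SparseVector/SparseVector case in the suggested Scala code."""
    out_idx, out_val = [], []
    j, k = 0, 0
    while j < len(indices_a) and k < len(indices_b):
        ia, ib = indices_a[j], indices_b[k]
        if ia == ib:
            out_idx.append(ia)
            out_val.append(values_a[j] * values_b[k])
            j += 1
            k += 1
        elif ia < ib:
            j += 1  # index only in a: product is zero, skip it
        else:
            k += 1  # index only in b: product is zero, skip it
    return out_idx, out_val

# idf nonzeros at {0, 2, 5}, term-frequency nonzeros at {2, 3, 5}:
print(sparse_multiply([0, 2, 5], [1.0, 2.0, 3.0], [2, 3, 5], [4.0, 5.0, 6.0]))
# → ([2, 5], [8.0, 18.0])
```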
[GitHub] spark pull request: [SPARK-11781][SPARKR] SparkR has problem in in...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9769#discussion_r46011950

--- Diff: R/pkg/R/DataFrame.R ---
@@ -700,25 +700,28 @@ setMethod("collect",
       # data of complex type can be held. But getting a cell from a column
       # of list type returns a list instead of a vector. So for columns of
       # non-complex type, append them as vector.
+      #
+      # For columns of complex type, be careful to access them.
+      # Get a column of complex type returns a list.
+      # Get a cell from a column of complex type returns a list instead of a vector.
       col <- listCols[[colIndex]]
+      colName <- dtypes[[colIndex]][[1]]
       if (length(col) <= 0) {
-        df[[names[colIndex]]] <- col
+        df[[colName]] <- col
       } else {
-        # TODO: more robust check on column of primitive types
-        vec <- do.call(c, col)
-        if (class(vec) != "list") {
-          df[[names[colIndex]]] <- vec
+        colType <- dtypes[[colIndex]][[2]]
+        if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
+          vec <- do.call(c, col)
+          stopifnot (class(vec) != "list")
--- End diff --

nit: no space for func call: `stopifnot(class(vec) != "list")`
[GitHub] spark pull request: [SPARK-5682][Core] Add encrypted shuffle in sp...
Github user winningsix commented on the pull request: https://github.com/apache/spark/pull/8880#issuecomment-160014509 retest this please.
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-160017383 LGTM
[GitHub] spark pull request: [SPARK-11997][SQL] NPE when save a DataFrame a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10001#issuecomment-160017247 **[Test build #46791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46791/consoleFull)** for PR 10001 at commit [`4de7697`](https://github.com/apache/spark/commit/4de7697753f0da6810190bea804b9f490a68bb98).
[GitHub] spark pull request: [SPARK-6518][MLlib][Example] Add example code ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9952#issuecomment-160018805 **[Test build #46790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46790/consoleFull)** for PR 9952 at commit [`86f6085`](https://github.com/apache/spark/commit/86f608546bfac909eb4e99bf49816596767feac8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11996][Core]Make the executor thread du...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9976#issuecomment-160020704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46786/ Test PASSed.
[GitHub] spark pull request: [SPARK-7857][MLLIB] Prevent IDFModel from retu...
Github user karlhigley commented on a diff in the pull request: https://github.com/apache/spark/pull/9843#discussion_r46013493

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/IDF.scala ---
@@ -211,14 +213,16 @@ private object IDFModel {
     val n = v.size
     v match {
       case SparseVector(size, indices, values) =>
-        val nnz = indices.size
-        val newValues = new Array[Double](nnz)
+        val newElements = new ArrayBuffer[(Int, Double)]
         var k = 0
-        while (k < nnz) {
-          newValues(k) = values(k) * idf(indices(k))
+        while (k < indices.size) {
+          val newValue = values(k) * idf(indices(k))
--- End diff --

Good point about `indices.size`; I've added a commit to move that outside the loop. The purpose of this ticket and PR is to remove explicit zeros from the output of `IDFModel.transform`. There may be further performance optimizations to be done in this section of code (e.g. the four cases as you suggest). It's not clear to me that the code in this PR is significantly less performant than what was already there, or that the suggested optimized code addresses the original issue. Maybe I'm missing something?
[GitHub] spark pull request: [SPARK-5337][Mesos][Standalone] respect spark....
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/8610#issuecomment-160027038 Leaving a mark here: there is something wrong with a particular test case that is causing the above failures. More details: https://issues.apache.org/jira/browse/SPARK-12021?jql=project%20%3D%20SPARK
[GitHub] spark pull request: [SPARK-11973][SQL] Improve optimizer code read...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9995
[GitHub] spark pull request: [SPARK-11700] [SQL] Remove thread local SQLCon...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9990#issuecomment-160032280 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11700] [SQL] Remove thread local SQLCon...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9990#issuecomment-160032215 **[Test build #46789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46789/consoleFull)** for PR 9990 at commit [`a8d096d`](https://github.com/apache/spark/commit/a8d096dd6a49e3eab3c5ffdb4e9ad38c820efc0d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11700] [SQL] Remove thread local SQLCon...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9990#issuecomment-160032281

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46789/
[GitHub] spark pull request: [SPARK-11996][Core]Make the executor thread du...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9976
[GitHub] spark pull request: [SPARK-12011] [SQL] Stddev/Variance etc should...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9994
[GitHub] spark pull request: [SPARK-11206] (Followup) Fix SQLListenerMemory...
Github user carsonwang commented on a diff in the pull request: https://github.com/apache/spark/pull/9991#discussion_r46014967

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/ui/SQLListenerSuite.scala ---
@@ -343,6 +343,8 @@ class SQLListenerMemoryLeakSuite extends SparkFunSuite {
       .set("spark.sql.ui.retainedExecutions", "50") // Set it to 50 to run this test quickly
     val sc = new SparkContext(conf)
     try {
+      // Clear the sql listener created by a previous test suite.
+      SQLContext.clearSqlListener()
--- End diff --

I think we can add a SparkContext stop hook. When SparkContext is being stopped, clear the reference. The user doesn't have to call a method to clear the sqlListener reference. The sqlListener is added to SparkContext and will only be garbage collected when SparkContext is stopped.
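The stop-hook idea above could be sketched roughly as follows. This is an illustration only: `SQLContext.clearSqlListener()` is the method introduced in the PR under review, while wiring it to `onApplicationEnd` through `addSparkListener` is an assumed mechanism, not necessarily how Spark implemented it.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}
import org.apache.spark.sql.SQLContext

// Hypothetical helper: drop the shared SQLListener reference automatically
// when the SparkContext shuts down, so individual test suites no longer
// need to call clearSqlListener() themselves.
def installSqlListenerCleanup(sc: SparkContext): Unit = {
  sc.addSparkListener(new SparkListener {
    override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit = {
      SQLContext.clearSqlListener()
    }
  })
}
```

With such a hook, the `clearSqlListener()` call added to `SQLListenerMemoryLeakSuite` in the diff would become unnecessary.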
[GitHub] spark pull request: [SPARK-11917][PYSPARK] Add SQLContext#dropTemp...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9903#issuecomment-160033387

Thanks - merging this in.
[GitHub] spark pull request: [SPARK-11856][SQL] add type cast if the real t...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/9840#discussion_r46015611

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala ---
@@ -85,12 +85,24 @@ case class DecimalType(precision: Int, scale: Int) extends FractionalType {
   private[sql] def isWiderThan(other: DataType): Boolean = other match {
     case dt: DecimalType =>
       (precision - scale) >= (dt.precision - dt.scale) && scale >= dt.scale
-    case dt: IntegralType =>
--- End diff --

I'm not sure if this is intentional. Why do we ignore fractional types?
[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9478#issuecomment-160034859

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46794/
[GitHub] spark pull request: [SPARK-11788][SQL]:surround timestamp/date val...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9872#issuecomment-160034888

**[Test build #2122 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2122/consoleFull)** for PR 9872 at commit [`ece3838`](https://github.com/apache/spark/commit/ece383837b9ed7d176d35f10460f7208184056bd).

* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10009#discussion_r46015873

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -104,16 +104,13 @@ class CodeGenContext {
   val equivalentExpressions: EquivalentExpressions = new EquivalentExpressions

   // State used for subexpression elimination.
-  case class SubExprEliminationState(
-      isLoaded: String,
-      code: GeneratedExpressionCode,
-      fnName: String)
+  case class SubExprEliminationState(isNull: String, value: String)

   // For each expression that is participating in subexpression elimination, the state to use.
   val subExprEliminationExprs = mutable.HashMap.empty[Expression, SubExprEliminationState]

-  // The collection of isLoaded variables that need to be reset on each row.
-  val subExprIsLoadedVariables = mutable.ArrayBuffer.empty[String]
+  // The collection of variables that need to be reset on each row.
--- End diff --

how about `The collection of sub-expression result reset methods that need to be called on each row.`?
[GitHub] spark pull request: [SPARK-11863][SQL] Unable to resolve order by ...
Github user dilipbiswal commented on the pull request: https://github.com/apache/spark/pull/9961#issuecomment-160002951

@marmbrus @cloud-fan Thank you !!
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-160005758

**[Test build #46782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46782/consoleFull)** for PR 9571 at commit [`25e77bd`](https://github.com/apache/spark/commit/25e77bd8eb1d8908410802dc2c59274b15f7294b).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-160005781

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46782/
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-160007559

**[Test build #46781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46781/consoleFull)** for PR 9571 at commit [`1dcbb5f`](https://github.com/apache/spark/commit/1dcbb5fce89a5f5f9a19b846432ac4d9937a23d0).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11373] [CORE] WiP Add metrics to the Hi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9571#issuecomment-160007599

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46781/
[GitHub] spark pull request: [SPARK-11996][Core]Make the executor thread du...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/9976#issuecomment-160008050

> Test build #46776 has finished for PR 9976 at commit d626bfc.
>
> This patch fails from timeout after a configured wait of `250m`.
> This patch merges cleanly.
> This patch adds no public classes.

Not sure why this happens frequently. I saw some weird logs in the build:

```
[info] MQTTStreamSuite:
[info] - mqtt input stream (1 second, 879 milliseconds)
[info] Test run started
[info] Test org.apache.spark.streaming.mqtt.JavaMQTTStreamSuite.testMQTTStream started
[info] Test run finished: 0 failed, 0 ignored, 1 total, 0.292s
[info] ScalaTest
[info] Run completed in 18 minutes, 2 seconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 2, Failed 0, Errors 0, Passed 2
```

The tests themselves took just a few seconds, but the total time was 18 minutes. Is this because of the global ivy lock, e.g. another build on the same machine taking a long time to resolve dependencies?
[GitHub] spark pull request: [SPARK-11996][Core]Make the executor thread du...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/9976#issuecomment-160008054

retest this please
[GitHub] spark pull request: [SPARK-11998] [SQL] [test-hadoop2.0] When down...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9979
[GitHub] spark pull request: [SPARK-11781][SPARKR] SparkR has problem in in...
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/9769#discussion_r46012852

--- Diff: R/pkg/R/DataFrame.R ---
@@ -700,25 +700,28 @@ setMethod("collect",
   # data of complex type can be held. But getting a cell from a column
   # of list type returns a list instead of a vector. So for columns of
   # non-complex type, append them as vector.
+  #
+  # For columns of complex type, be careful to access them.
+  # Get a column of complex type returns a list.
+  # Get a cell from a column of complex type returns a list instead of a vector.
   col <- listCols[[colIndex]]
+  colName <- dtypes[[colIndex]][[1]]
   if (length(col) <= 0) {
-    df[[names[colIndex]]] <- col
+    df[[colName]] <- col
   } else {
-    # TODO: more robust check on column of primitive types
-    vec <- do.call(c, col)
-    if (class(vec) != "list") {
-      df[[names[colIndex]]] <- vec
+    colType <- dtypes[[colIndex]][[2]]
+    if (!is.null(PRIMITIVE_TYPES[[colType]]) && colType != "binary") {
+      vec <- do.call(c, col)
+      stopifnot (class(vec) != "list")
--- End diff --

fixed
[GitHub] spark pull request: [SPARK-5682][Core] Add encrypted shuffle in sp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8880#issuecomment-160018955

**[Test build #46793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46793/consoleFull)** for PR 8880 at commit [`8b0aa5e`](https://github.com/apache/spark/commit/8b0aa5e647f5b8a47cbe45cd4b582130b82886d6).
[GitHub] spark pull request: [SPARK-6518][MLlib][Example] Add example code ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9952#issuecomment-160018847

Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-6518][MLlib][Example] Add example code ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9952#issuecomment-160018848

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46790/
[GitHub] spark pull request: [SPARK-11781][SPARKR] SparkR has problem in in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9769#issuecomment-160023496

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46792/
[GitHub] spark pull request: [SPARK-11781][SPARKR] SparkR has problem in in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9769#issuecomment-160023492

Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11781][SPARKR] SparkR has problem in in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9769#issuecomment-160023298

**[Test build #46792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46792/consoleFull)** for PR 9769 at commit [`f073c3a`](https://github.com/apache/spark/commit/f073c3aede9fd258d2354db04d5eae7c14e40c25).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-6521][Core] Bypass unnecessary network ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9478#issuecomment-160023236

**[Test build #46794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46794/consoleFull)** for PR 9478 at commit [`ba94687`](https://github.com/apache/spark/commit/ba94687b48d08fc6a4c863fbafeb5d39181cc53c).
[GitHub] spark pull request: doc typo: "classificaion" -> "classification"
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10008#issuecomment-160031822

Thanks for fixing it. I'm going to merge this.
[GitHub] spark pull request: [SPARK-12020] [TESTS] [test-hadoop2.0] PR buil...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10010#issuecomment-160031786

LGTM
[GitHub] spark pull request: [SPARK-11881][SQL] Fix for postgresql fetchsiz...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/9861#discussion_r46014887

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala ---
@@ -489,6 +494,13 @@ private[sql] class JDBCRDD(
   }
   try {
     if (null != conn) {
+      if (!conn.getAutoCommit && !conn.isClosed) {
+        try {
+          conn.commit()
+        } catch {
+          case e: Throwable => logWarning("Exception committing transaction", e)
--- End diff --

this should catch nonfatal - i will fix it when merging.
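The `NonFatal` change rxin refers to is the standard Scala idiom: it matches ordinary exceptions but lets fatal errors such as `OutOfMemoryError` propagate instead of being swallowed by the log-and-continue branch. A minimal sketch of the corrected pattern (the `commitQuietly` wrapper and its `logWarning` parameter are illustrative; the real code lives inside `JDBCRDD`):

```scala
import java.sql.Connection
import scala.util.control.NonFatal

// Sketch: commit the transaction, logging and swallowing only non-fatal
// exceptions; fatal errors (OutOfMemoryError, etc.) still propagate.
def commitQuietly(conn: Connection, logWarning: (String, Throwable) => Unit): Unit = {
  try {
    conn.commit()
  } catch {
    case NonFatal(e) => logWarning("Exception committing transaction", e)
  }
}
```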
[GitHub] spark pull request: [SPARK-11881][SQL] Fix for postgresql fetchsiz...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9861#issuecomment-160033166

LGTM - going to merge it.
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10009#issuecomment-160033692

**[Test build #46795 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46795/consoleFull)** for PR 10009 at commit [`b3cf6a8`](https://github.com/apache/spark/commit/b3cf6a8ad94e2ba37c60ccda99f830160fa464d6).
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10009#discussion_r46015031

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -417,18 +413,12 @@ class CodeGenContext {
   val code = expr.gen(this)
   val fn =
     s"""
-      |private void $fnName(InternalRow ${INPUT_ROW}) {
-      |  if (!$isLoaded) {
-      |    ${code.code.trim}
-      |    $isLoaded = true;
-      |    $isNull = ${code.isNull};
-      |    $value = ${code.value};
-      |  }
+      |private ${javaType(expr.dataType)} $fnName(InternalRow ${INPUT_ROW}) {
+      |  ${code.code.trim}
+      |  $isNull = ${code.isNull};
+      |  return ${code.value};
--- End diff --

can we make this method return void? we can just assign values to `isNull` and `value` as they are both member variables.
[GitHub] spark pull request: [SPARK-12018][SQL] Refactor common subexpressi...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10009#discussion_r46015118

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -417,18 +413,12 @@ class CodeGenContext {
   val code = expr.gen(this)
   val fn =
     s"""
-      |private void $fnName(InternalRow ${INPUT_ROW}) {
-      |  if (!$isLoaded) {
-      |    ${code.code.trim}
-      |    $isLoaded = true;
-      |    $isNull = ${code.isNull};
-      |    $value = ${code.value};
-      |  }
+      |private ${javaType(expr.dataType)} $fnName(InternalRow ${INPUT_ROW}) {
+      |  ${code.code.trim}
+      |  $isNull = ${code.isNull};
+      |  return ${code.value};
--- End diff --

and then we can remove the `subExprInitVariables` because the reset is just calling this function.
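Under cloud-fan's suggestion, the generated helper would return `void` and assign the member fields directly, so invoking the function per row doubles as the reset and no separate init/reset bookkeeping is needed. A rough sketch of what the codegen template might become (illustrative only, assuming the surrounding `CodeGenContext` names from the diff; this is not necessarily the merged code):

```scala
// Sketch: a void subexpression function that writes into the member
// variables; calling it recomputes the subexpression for the current row.
val fn =
  s"""
     |private void $fnName(InternalRow $INPUT_ROW) {
     |  ${code.code.trim}
     |  $isNull = ${code.isNull};
     |  $value = ${code.value};
     |}
   """.stripMargin
```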
[GitHub] spark pull request: [SPARK-11778][SQL]:add regression test
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9890#issuecomment-160033616

Thanks - I'm merging this.