[GitHub] spark pull request: SPARK-1325. The maven build error for Spark To...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/240#issuecomment-38776405

Uh, creating a different PR is a good idea.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request: Fix SPARK-1325: The maven build error for Spar...
GitHub user witgo reopened a pull request: https://github.com/apache/spark/pull/234

Fix SPARK-1325: The maven build error for Spark Tools

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1325

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/234.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #234

commit 24ce39e6994bde3c4e3cf7b5a54c4a9dde2b00ba
Author: witgo wi...@qq.com
Date: 2014-03-26T03:59:03Z

    Fix SPARK-1325

commit a416d64f5f7f6dd3c75483f019f20481334f9aaa
Author: witgo wi...@qq.com
Date: 2014-03-26T05:09:23Z

    Modify scala-actors to scope test

commit bd3e72e84c564b3306cbba367317a2e62acbe5b6
Author: witgo wi...@qq.com
Date: 2014-03-27T02:08:56Z

    Merge branch 'master' of https://github.com/apache/spark into SPARK-1325

commit fad556079741cc1df65ad3c8a22a13355039be81
Author: witgo wi...@qq.com
Date: 2014-03-27T02:12:28Z

    Merge master

commit 85df1344eed004c833df2d5b1ced58eb4b526504
Author: witgo wi...@qq.com
Date: 2014-03-27T15:20:48Z

    Delete redundant dependency
[GitHub] spark pull request: Fix org.scala-lang: * inconsistent versions
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/234
[GitHub] spark pull request: Fix SPARK-1413: Parquet messes up stdout and s...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/325

Fix SPARK-1413: Parquet messes up stdout and stdin when used in Spark REPL

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1413

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/325.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #325

commit 70f3c6445afd4267e297a9273daf3a56e5808112
Author: witgo wi...@qq.com
Date: 2014-04-04T09:02:34Z

    Fix SPARK-1413
[GitHub] spark pull request: [WIP] Fix SPARK-1413: Parquet messes up stdout...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/325#issuecomment-39595364

There is a problem and I do not know what the cause is:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

case class Record(key: Int, value: String)

val sqlContext = new SQLContext(sc)
import sqlContext._
val rdd = sc.parallelize((1 to 100).map(i => Record(i, s"val_$i")))
rdd.registerAsTable("records")
rdd.saveAsParquetFile("records.parquet")
```

This prints:

```
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
```
[GitHub] spark pull request: [WIP] Fix SPARK-1413: Parquet messes up stdout...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/325#issuecomment-39619758

[class parquet.Log](https://github.com/Parquet/parquet-mr/blob/master/parquet-common/src/main/java/parquet/Log.java) has a static block that adds a default handler in case there is none.
[GitHub] spark pull request: [WIP] Fix SPARK-1413: Parquet messes up stdout...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/325#issuecomment-39637576

The parent logger of `parquet.hadoop.ColumnChunkPageWriteStore` (and of the other parquet classes) is `Logger.getLogger("parquet")`, so we only need to configure `Logger.getLogger("parquet")`. Calling `LogManager.getLogManager.reset()` twice seems to cause some problems.
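A minimal sketch of the point above, written against `java.util.logging` directly (the helper name `quietParquet` and the chosen level are mine, not from the PR): configuring only the parent "parquet" logger is enough, because child loggers such as `parquet.hadoop.ColumnChunkPageWriteStore` resolve their effective level through the JUL parent chain.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ParquetLoggerSetup {
    // Hypothetical helper: touch only the parent "parquet" logger.
    static void quietParquet() {
        Logger parent = Logger.getLogger("parquet");
        parent.setLevel(Level.SEVERE);       // suppress INFO/WARNING chatter
        parent.setUseParentHandlers(false);  // do not forward to the root handlers
    }

    public static void main(String[] args) {
        quietParquet();
        Logger child = Logger.getLogger("parquet.hadoop.ColumnChunkPageWriteStore");
        // The child has no level of its own; JUL walks up to "parquet" for it.
        System.out.println(child.getLevel());             // prints: null
        System.out.println(child.isLoggable(Level.INFO)); // prints: false
    }
}
```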
[GitHub] spark pull request: remove scalalogging-slf4j dependency
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/332

remove scalalogging-slf4j dependency

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark remove_scalalogging

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/332.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #332

commit eb93ee2e88cfe030da1eb2af01c5fb04ddb6d647
Author: witgo wi...@qq.com
Date: 2014-04-05T14:21:18Z

    remove scalalogging-slf4j dependency
[GitHub] spark pull request: Fix SPARK-1420 The maven build error for Spark...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/333

Fix SPARK-1420 The maven build error for Spark Catalyst

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1420

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/333.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #333

commit 902519eb58cf5b56ecab5fd0b7a3f26cc73a4933
Author: witgo wi...@qq.com
Date: 2014-04-05T15:32:03Z

    add dependency scala-reflect to catalyst
[GitHub] spark pull request: Spark logger moving to use scala-logging
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/332#issuecomment-39701040

I did not find a specific call that affects performance. It is possible that `logger.debug` is simply called many times in Spark Catalyst, for example in [RuleExecutor.scala#L64](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L64).
[GitHub] spark pull request: [WIP] Fix SPARK-1413: Parquet messes up stdout...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/325#issuecomment-39706859

@AndreSchumacher [parquet.Log](https://github.com/Parquet/parquet-mr/blob/master/parquet-common/src/main/java/parquet/Log.java) has a static block that adds a default handler in case there is none. The following code unsets `Logger.getLogger("parquet")`:

```scala
val parquetLogger = java.util.logging.Logger.getLogger("parquet")
parquetLogger.getHandlers.foreach(parquetLogger.removeHandler)
if (parquetLogger.getLevel != null) parquetLogger.setLevel(null)
if (!parquetLogger.getUseParentHandlers) parquetLogger.setUseParentHandlers(true)
```
[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-39816405

Good job, guys.
[GitHub] spark pull request: Fix:SPARK-1441 Compile Spark Core error with H...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/357

Fix:SPARK-1441 Compile Spark Core error with Hadoop 0.23.x

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1441

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/357.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #357

commit 7de20115cf699be623c4362377637422d7416289
Author: witgo wi...@qq.com
Date: 2014-04-08T06:38:20Z

    add avro dependency to core project
[GitHub] spark pull request: Fix SPARK-1413: Parquet messes up stdout and s...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/325#issuecomment-39819593

@AndreSchumacher You're right, the code has been modified.
[GitHub] spark pull request: Fix SPARK-1413: Parquet messes up stdout and s...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/325#discussion_r11467386

--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -135,4 +136,6 @@ trait Logging {
 private object Logging {
   @volatile private var initialized = false
   val initLock = new Object()
+  SLF4JBridgeHandler.removeHandlersForRootLogger()
--- End diff --

Yes, removing the handlers for the root logger directly here is somewhat arbitrary.
[GitHub] spark pull request: Fix SPARK-1413: Parquet messes up stdout and s...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/325#discussion_r11472274

--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -135,4 +136,6 @@ trait Logging {
 private object Logging {
   @volatile private var initialized = false
   val initLock = new Object()
+  // SLF4JBridgeHandler.removeHandlersForRootLogger()
--- End diff --

You are right. I don't understand Scala reflection; could you give an example?
[GitHub] spark pull request: Fix SPARK-1413: Parquet messes up stdout and s...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/325#issuecomment-40052128

@pwendell Thank you, this patch works. But it only solves one problem: Spark's logging dependencies are fixed, so we can only use log4j. In the SparkBuild.scala file:

```scala
"log4j"     % "log4j"          % "1.2.17",
"org.slf4j" % "slf4j-api"      % slf4jVersion,
"org.slf4j" % "slf4j-log4j12"  % slf4jVersion,
"org.slf4j" % "jul-to-slf4j"   % slf4jVersion,
"org.slf4j" % "jcl-over-slf4j" % slf4jVersion,
```

Should this become a configurable logging system?
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/332#issuecomment-40221509

How do I get Jenkins to run the tests?
[GitHub] spark pull request: [WIP] Add the lifecycle interface
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/379#issuecomment-40224441

@andrewor14 I don't speak English well. On the weekend I will write a document.
[GitHub] spark pull request: [WIP] SPARK-1477: Add the lifecycle interface
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/379#issuecomment-40279118

@andrewor14, @tdas, mind reviewing this?
[GitHub] spark pull request: SPARK-1441: Compile Spark Core error with Hado...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/357#issuecomment-40301947

@srowen mind reviewing the PR?
[GitHub] spark pull request: SPARK-1441: Compile Spark Core error with Hado...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/357#issuecomment-40304913

So, if someone compiles Spark with Hadoop 0.23.x, how is the profile automatically activated?

```xml
<profile>
  <id>yarn-alpha</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
    </dependency>
  </dependencies>
</profile>
```

Maven does not support such an activation:

```xml
<activation>
  <property>
    <name>hadoop.version</name>
    <value>0.23.*</value>
  </property>
</activation>
```
[GitHub] spark pull request: SPARK-1441: Compile Spark Core error with Hado...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/357#issuecomment-40305799

```xml
<activation>
  <property>
    <name>hadoop.version</name>
    <value>[0.23,0.24)</value>
  </property>
</activation>
```

It doesn't work either; see [PropertyProfileActivator.java](https://github.com/apache/maven/blob/master/maven-model-builder/src/main/java/org/apache/maven/model/profile/activation/PropertyProfileActivator.java).
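Since Maven's `<property>` activation only does an exact string match (or a negation), one workaround, sketched here under the assumption of a dedicated profile id (the id `hadoop-0.23` is mine, not from the build), is to skip property activation entirely and select the profile explicitly with `-P`:

```xml
<profile>
  <!-- hypothetical id; Maven property activation cannot match 0.23.* or [0.23,0.24) -->
  <id>hadoop-0.23</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
    </dependency>
  </dependencies>
</profile>
```

It would then be activated with something like `mvn -Phadoop-0.23 -Dhadoop.version=0.23.10 package` (the version shown is illustrative).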
[GitHub] spark pull request: SPARK-1477: Add the lifecycle interface
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/379#issuecomment-40327489

@andrewor14, @tdas, @pwendell, mind reviewing the PR?
[GitHub] spark pull request: Make distribution
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/412

Make distribution

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark make_distribution

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/412.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #412

commit 709d71945a75a11fea6d91fd97ded30bd98b2950
Author: witgo wi...@qq.com
Date: 2014-04-15T07:26:30Z

    add with-hive argument to make-distribution.sh

commit 6d344c8e35f28a2bb1063bbd24057e256d3fa2f2
Author: witgo wi...@qq.com
Date: 2014-04-15T07:29:29Z

    add with-hive argument to make-distribution.sh
[GitHub] spark pull request: improve the readability of SparkContext.scala
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/414

improve the readability of SparkContext.scala

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SparkContext

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/414.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #414

commit a9d7cd0b4e6cf9e2c08a22cdd9e4d61b86ec55bc
Author: witgo wi...@qq.com
Date: 2014-04-14T07:57:08Z

    improve the readability of SparkContext code
[GitHub] spark pull request: improve the readability of SparkContext.scala
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/414#issuecomment-40675137

From the perspective of the component lifecycle, only the stop method is incomplete, and it would be better to write the initialization code and the start code separately.
[GitHub] spark pull request: Clean up and simplify Spark configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/299#discussion_r11720287

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -123,6 +142,14 @@ object SparkSubmit {
     val options = List[OptionAssigner](
       new OptionAssigner(appArgs.master, ALL_CLUSTER_MGRS, false, sysProp = "spark.master"),
+
+      new OptionAssigner(appArgs.driverExtraClassPath, STANDALONE | YARN, true,
+        sysProp = "spark.driver.extraClassPath"),
+      new OptionAssigner(appArgs.driverExtraJavaOptions, STANDALONE | YARN, true,
+        sysProp = "spark.driver.extraJavaOpts"),
+      new OptionAssigner(appArgs.driverExtraLibraryPath, STANDALONE | YARN, true,
+        sysProp = "spark.driver.extraLibraryPath"),
+
       new OptionAssigner(appArgs.driverMemory, YARN, true, clOption = "--driver-memory"),
       new OptionAssigner(appArgs.name, YARN, true, clOption = "--name"),
--- End diff --

We could add an OptionAssigner to set spark.app.name, e.g.:

```scala
new OptionAssigner(appArgs.name, STANDALONE | MESOS | YARN, false, sysProp = "spark.app.name"),
```
[GitHub] spark pull request: [SPARK-1522] : YARN ClientBase throws a NPE if...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/433#discussion_r11726876

--- Diff: project/SparkBuild.scala ---
@@ -52,7 +52,7 @@ object SparkBuild extends Build {
   val SCALAC_JVM_VERSION = "jvm-1.6"
   val JAVAC_JVM_VERSION = "1.6"

-  lazy val root = Project("root", file("."), settings = rootSettings) aggregate(allProjects: _*)
--- End diff --

Is this better?

```scala
lazy val spark = Project("spark", file("."), settings = rootSettings) aggregate(allProjects: _*)
```
[GitHub] spark pull request: pom.xml modifications added to SparkBuild.scal...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/435

pom.xml modifications added to SparkBuild.scala

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SparkBuild

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/435.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #435

commit 6851becba7058571816848326713fa8d08998e5d
Author: witgo wi...@qq.com
Date: 2014-04-17T13:45:48Z

    Maintain consistent SparkBuild.scala, pom.xml
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/332#issuecomment-40863767

@marmbrus You are right: macros are faster than the inline method, though the gap is small. See the [test code](https://github.com/witgo/spark/blob/logger/core/src/test/scala/org/apache/spark/LoggingSuite.scala). The results:

```
1
logString:    38987.745 ms
logInline:    14.198 ms
logFunction0: 16.22 ms
logMacros:    12.312 ms
2
logString:    39037.412 ms
logInline:    14.349 ms
logFunction0: 16.2 ms
logMacros:    12.249 ms
3
logString:    39169.036 ms
logInline:    14.324 ms
logFunction0: 16.871 ms
logMacros:    11.324 ms
```
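The gap between `logString` and the other three columns comes from building the message eagerly even when the level is disabled. Outside of Scala macros, the same laziness can be had with a supplier argument; a self-contained sketch using the `Supplier` overload that `java.util.logging` ships (the logger name and the counter are illustrative, not from the PR):

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyLogging {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setLevel(Level.WARNING);
        log.setUseParentHandlers(false); // keep the demo's console output clean

        final int[] evaluations = {0};
        Supplier<String> expensive = () -> {
            evaluations[0]++;
            return "costly message " + System.nanoTime();
        };

        // The Supplier overload only invokes the lambda if the level is enabled,
        // so the disabled FINE call never pays for message construction.
        log.log(Level.FINE, expensive);     // disabled: supplier not called
        log.log(Level.WARNING, expensive);  // enabled: supplier called once

        System.out.println(evaluations[0]); // prints: 1
    }
}
```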
[GitHub] spark pull request: Fix org.scala-lang: * inconsistent versions
GitHub user witgo reopened a pull request: https://github.com/apache/spark/pull/234

Fix org.scala-lang: * inconsistent versions

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1325

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/234.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #234

commit 841721e03cc44ee7d8fe72c882db8c0f9f3af365
Author: Patrick Wendell pwend...@gmail.com
Date: 2014-03-31T19:07:14Z

    SPARK-1352: Improve robustness of spark-submit script

    1. Better error messages when required arguments are missing.
    2. Support for unit testing cases where presented arguments are invalid.
    3. Bug fix: Only use environment variables when they are set (otherwise will cause NPE).
    4. A verbose mode to aid debugging.
    5. Visibility of several variables is set to private.
    6. Deprecation warning for existing scripts.

    Author: Patrick Wendell pwend...@gmail.com

    Closes #271 from pwendell/spark-submit and squashes the following commits:

    9146def [Patrick Wendell] SPARK-1352: Improve robustness of spark-submit script

commit 5731af5be65ccac831445f351baf040a0d007687
Author: Michael Armbrust mich...@databricks.com
Date: 2014-03-31T22:23:46Z

    [SQL] Rewrite join implementation to allow streaming of one relation.

    Before we were materializing everything in memory. This also uses the projection interface so will be easier to plug in code gen (its ported from that branch). @rxin @liancheng

    Author: Michael Armbrust mich...@databricks.com

    Closes #250 from marmbrus/hashJoin and squashes the following commits:

    1ad873e [Michael Armbrust] Change hasNext logic back to the correct version.
    8e6f2a2 [Michael Armbrust] Review comments.
    1e9fb63 [Michael Armbrust] style
    bc0cb84 [Michael Armbrust] Rewrite join implementation to allow streaming of one relation.

commit 33b3c2a8c6c71b89744834017a183ea855e1697c
Author: Patrick Wendell pwend...@gmail.com
Date: 2014-03-31T23:25:43Z

    SPARK-1365 [HOTFIX] Fix RateLimitedOutputStream test

    This test needs to be fixed. It currently depends on Thread.sleep() having exact-timing semantics, which is not a valid assumption.

    Author: Patrick Wendell pwend...@gmail.com

    Closes #277 from pwendell/rate-limited-stream and squashes the following commits:

    6c0ff81 [Patrick Wendell] SPARK-1365: Fix RateLimitedOutputStream test

commit 564f1c137caf07bd1f073ec6c93551dcad935ee5
Author: Sandy Ryza sa...@cloudera.com
Date: 2014-04-01T02:56:31Z

    SPARK-1376. In the yarn-cluster submitter, rename "args" option to "arg"

    Author: Sandy Ryza sa...@cloudera.com

    Closes #279 from sryza/sandy-spark-1376 and squashes the following commits:

    d8aebfa [Sandy Ryza] SPARK-1376. In the yarn-cluster submitter, rename "args" option to "arg"

commit 94fe7fd4fa9749cb13e540e4f9caf28de47eaf32
Author: Andrew Or andrewo...@gmail.com
Date: 2014-04-01T04:42:36Z

    [SPARK-1377] Upgrade Jetty to 8.1.14v20131031

    Previous version was 7.6.8v20121106. The only difference between Jetty 7 and Jetty 8 is that the former uses Servlet API 2.5, while the latter uses Servlet API 3.0.

    Author: Andrew Or andrewo...@gmail.com

    Closes #280 from andrewor14/jetty-upgrade and squashes the following commits:

    dd57104 [Andrew Or] Merge github.com:apache/spark into jetty-upgrade
    e75fa85 [Andrew Or] Upgrade Jetty to 8.1.14v20131031

commit ada310a9d3d5419e101b24d9b41398f609da1ad3
Author: Andrew Or andrewo...@gmail.com
Date: 2014-04-01T06:01:14Z

    [Hot Fix #42] Persisted RDD disappears on storage page if re-used

    If a previously persisted RDD is re-used, its information disappears from the Storage page. This is because the tasks associated with re-using the RDD do not report the RDD's blocks as updated (which is correct). On stage submit, however, we overwrite any existing information regarding that RDD with a fresh one, whether or not the information for the RDD already exists.

    Author: Andrew Or andrewo...@gmail.com

    Closes #281 from andrewor14/ui-storage-fix and squashes the following commits:

    408585a [Andrew Or] Fix storage UI bug

commit f5c418da044ef7f3d7185cc5bb1bef79d7f4e25c
Author: Michael Armbrust mich...@databricks.com
Date: 2014-04-01T21:45:44Z

    [SQL] SPARK-1372 Support for caching and uncaching tables in a SQLContext.

    This doesn't yet support different databases in Hive (though you can probably workaround this by calling `USE dbname`). However, given the time constraints for 1.0 I think its probably worth including this now and extending the functionality in the next release.

    Author
[GitHub] spark pull request: Fix org.scala-lang: * inconsistent versions
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/234
[GitHub] spark pull request: Fix org.scala-lang: * inconsistent versions de...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/465

Fix org.scala-lang: * inconsistent versions dependency for maven

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-1325

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/465.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #465

commit 42238b6292008d24c96e07f26083b2349f8dd48e
Author: witgo wi...@qq.com
Date: 2014-04-21T14:54:21Z

    Fix org.scala-lang: * inconsistent versions for maven

commit b434ec083178aef398ce9c2df431652ac7f28d08
Author: witgo wi...@qq.com
Date: 2014-04-21T15:31:31Z

    remove exclusion scalap
[GitHub] spark pull request: Fix org.scala-lang: * inconsistent versions de...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/465#issuecomment-41006929 @srowen I'm sorry, I submitted a formatting change to sql/catalyst/pom.xml, sql/hive/pom.xml, and sql/core/pom.xml (four-space indentation reformatted to two spaces). ---
[GitHub] spark pull request: SPARK-1441: Compile Spark Core error with Hado...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/357 ---
[GitHub] spark pull request: Improved build configuration
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/480 Improved build configuration You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark format_pom Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/480.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #480 commit 0c6c1fc4c6005def6391beadd55e463cc5b65344 Author: witgo wi...@qq.com Date: 2014-04-22T07:29:53Z Fix compile spark core error with hadoop 0.23.x ---
[GitHub] spark pull request: [WIP]Improved build configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/480#discussion_r11852400

--- Diff: pom.xml ---
```
@@ -519,6 +519,44 @@
       </exclusions>
     </dependency>
     <dependency>
+      <groupId>org.apache.avro</groupId>
```
--- End diff --

spark-hive dependency:
```
[INFO] +- org.apache.hive:hive-serde:jar:0.12.0:compile
[INFO] |  +- org.apache.hive:hive-common:jar:0.12.0:compile
[INFO] |  |  +- org.apache.hive:hive-shims:jar:0.12.0:compile
[INFO] |  |  |  \- commons-logging:commons-logging-api:jar:1.0.4:compile
[INFO] |  |  +- commons-cli:commons-cli:jar:1.2:compile
[INFO] |  |  \- org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO] |  |     \- org.tukaani:xz:jar:1.0:compile
[INFO] |  +- org.mockito:mockito-all:jar:1.8.5:test (version managed from 1.8.2; scope managed from compile)
[INFO] |  +- org.apache.thrift:libfb303:jar:0.9.0:compile
[INFO] |  |  \- org.apache.thrift:libthrift:jar:0.9.0:compile
[INFO] |  |     +- org.apache.httpcomponents:httpclient:jar:4.1.3:compile
[INFO] |  |     \- org.apache.httpcomponents:httpcore:jar:4.1.3:compile
[INFO] |  +- commons-codec:commons-codec:jar:1.4:compile
[INFO] |  +- org.apache.avro:avro:jar:1.7.4:compile (version managed from 1.7.1)
[INFO] |  |  \- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |  \- org.apache.avro:avro-mapred:jar:1.7.1:compile
[INFO] |     \- org.apache.avro:avro-ipc:jar:1.7.1:compile
[INFO] |        +- org.mortbay.jetty:jetty:jar:6.1.26:compile
[INFO] |        +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
[INFO] |        +- org.apache.velocity:velocity:jar:1.7:compile
[INFO] |        \- org.mortbay.jetty:servlet-api:jar:2.5-20081211:compile
```

spark-streaming-flume dependency:
```
[INFO] +- org.apache.flume:flume-ng-sdk:jar:1.2.0:compile
[INFO] |  +- org.apache.avro:avro:jar:1.7.4:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-core-asl:jar:1.8.8:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:compile
[INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |  |  \- org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO] |  |     \- org.tukaani:xz:jar:1.0:compile
[INFO] |  +- org.apache.avro:avro-ipc:jar:1.6.3:compile
[INFO] |  |  +- org.mortbay.jetty:jetty:jar:6.1.26:compile
[INFO] |  |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:compile
[INFO] |  |  \- org.apache.velocity:velocity:jar:1.7:compile
[INFO] |  |     +- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  |     \- commons-lang:commons-lang:jar:2.4:compile
```

The versions of this dependency are inconsistent, and the dependency added here only affects the modules that depend on avro-ipc. ---
[GitHub] spark pull request: [WIP]Improved build configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/480#discussion_r11852495

--- Diff: pom.xml ---
```
@@ -793,6 +831,157 @@
   </build>
   <profiles>
+    <!-- SPARK-1121: Adds an explicit dependency on Avro to work around a Hadoop 0.23.X issue -->
+    <profile>
```
--- End diff --

I do not know how to do that; could you give an example? ---
[GitHub] spark pull request: [WIP]Improved build configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/480#discussion_r11852713

--- Diff: bagel/pom.xml ---
```
@@ -31,20 +31,6 @@
   <name>Spark Project Bagel</name>
   <url>http://spark.apache.org/</url>
-  <profiles>
```
--- End diff --

This block exists in almost all modules; it should be inherited from the parent. ---
[GitHub] spark pull request: [WIP]Improved build configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/480#discussion_r11852992

--- Diff: examples/pom.xml ---
```
@@ -124,6 +110,10 @@
   <groupId>commons-logging</groupId>
   <artifactId>commons-logging</artifactId>
 </exclusion>
+<exclusion>
```
--- End diff --

The jar is very big: about 12 MB. ---
[GitHub] spark pull request: [WIP]Improved build configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/480#discussion_r11856833

--- Diff: pom.xml ---
```
@@ -892,10 +1081,11 @@
 <dependency>
   <groupId>org.apache.zookeeper</groupId>
   <artifactId>zookeeper</artifactId>
+  <version>3.4.5</version>
```
--- End diff --

curator-recipes 2.4.0 => zookeeper 3.4.5
hbase 0.94.6 => zookeeper 3.4.5
kafka_2.10 0.8.0 => zookeeper 3.3.4
Hadoop does not directly depend on ZooKeeper. ---
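A minimal sketch of how such a version conflict is usually resolved in Maven: pin the version once in the parent POM's `dependencyManagement`, so every module (including transitive pulls like kafka's ZooKeeper 3.3.4) resolves to the same version. The surrounding parent-POM structure is assumed; the coordinates mirror the diff above.

```xml
<!-- Hypothetical fragment for a parent pom.xml. dependencyManagement does not
     add the dependency itself; it only forces the version wherever a module
     (directly or transitively) depends on zookeeper. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.4.5</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```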
[GitHub] spark pull request: [WIP]Improved build configuration
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/480#discussion_r11858762

--- Diff: pom.xml ---
```
@@ -892,10 +1081,11 @@
 <dependency>
   <groupId>org.apache.zookeeper</groupId>
   <artifactId>zookeeper</artifactId>
+  <version>3.4.5</version>
```
--- End diff --

[SPARK-1064](https://issues.apache.org/jira/browse/SPARK-1064), [PR 102](https://github.com/apache/spark/pull/102). There is no equivalent feature in sbt. ---
[GitHub] spark pull request: pom.xml modifications added to SparkBuild.scal...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/435 ---
[GitHub] spark pull request: Fix org.scala-lang: * inconsistent versions de...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/465 ---
[GitHub] spark pull request: SPARK-1500: add with-hive argument to make-dis...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/412 ---
[GitHub] spark pull request: SPARK-1119 and other build improvements
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/502#issuecomment-41144046 Why change to using the Maven build? [PR 480](https://github.com/apache/spark/pull/480) has some relevant changes. ---
[GitHub] spark pull request: Clean up and simplify Spark configuration
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/299#issuecomment-41150159 `SPARK_DAEMON_OPTS` seems to have no effect. ---
[GitHub] spark pull request: Modify spark.ui.killEnabled default is false
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/510 Modify spark.ui.killEnabled default is false You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark killEnabled Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/510.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #510 commit d685783974ef3736d50ea5d16b7b97c8d9bc3b7e Author: witgo wi...@qq.com Date: 2014-04-23T17:06:22Z Modify spark.ui.killEnabled default is false ---
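For context, the setting this PR flips is an ordinary Spark configuration property; a deployment that still wants the kill links in the web UI could re-enable it explicitly. A hypothetical `conf/spark-defaults.conf` entry (the property name is the one in the PR title; the value is illustrative):

```properties
# Re-enable the stage/job "kill" links in the web UI for a trusted cluster;
# the PR proposes shipping with this disabled by default.
spark.ui.killEnabled true
```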
[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1031#issuecomment-52016199 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1022#issuecomment-52138943 YARN does not seem to do any processing here. We can use the solution in #1031. ---
[GitHub] spark pull request: [WIP][SPARK-1405]Collapsed Gibbs sampling base...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/1983 [WIP][SPARK-1405]Collapsed Gibbs sampling based Latent Dirichlet Allocation This PR is based on @yinxusen's #476 You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark cgs_lda Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1983.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1983 commit 5c1ec0aa4a43bfd0cb522aa8f0b71c0aace2 Author: GuoQiang Li wi...@qq.com Date: 2014-08-16T07:23:13Z Collapsed Gibbs sampling based Latent Dirichlet Allocation ---
[GitHub] spark pull request: [SPARK-2677][SPARK-2717]BasicBlockFetchIterato...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1619 ---
[GitHub] spark pull request: SPARK-2701: ConnectionManager throws out of C...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1603 ---
[GitHub] spark pull request: In the stop method of ConnectionManager to can...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/1989 In the stop method of ConnectionManager to cancel the ackTimeoutMonitor You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark cancel_ackTimeoutMonitor Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1989.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1989 commit 4a700fab2ff7fcae3bd9f7ed9d385bb22adf317d Author: GuoQiang Li wi...@qq.com Date: 2014-08-17T01:14:22Z In the stop method of ConnectionManager to cancel the ackTimeoutMonitor ---
[GitHub] spark pull request: SPARK-732: eliminate duplicate update of the a...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/228#issuecomment-52419470 How about recording the `(stageId, partitionId)` in `Accumulable`? ---
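The idea in the comment above can be sketched as follows. This is a hypothetical illustration, not Spark's `Accumulable` API: remember which `(stageId, partitionId)` pairs have already contributed, so a re-run of the same task does not update the accumulator twice.

```scala
// Hypothetical sketch of deduplicating accumulator updates by task identity.
// Names (AccumulatorDedup, add, value) are illustrative, not Spark's.
object AccumulatorDedup {
  private val seen = scala.collection.mutable.Set[(Int, Int)]()
  private var total = 0L

  // Apply the update only if this (stageId, partitionId) has not been seen,
  // so a retried or speculatively re-executed task is not double-counted.
  def add(stageId: Int, partitionId: Int, delta: Long): Unit = {
    if (seen.add((stageId, partitionId))) {
      total += delta
    }
  }

  def value: Long = total
}
```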
[GitHub] spark pull request: [SPARK-2873] [SQL] using ExternalAppendOnlyMap...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1822#issuecomment-52491562 Try this:
```
git commit -m "Big-ass commit" --allow-empty
git rebase -i master
git push origin sql-memory-patch -f
```
---
[GitHub] spark pull request: [SPARK-3015] Block on cleaning tasks to preven...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1931#discussion_r16464399

--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
```
@@ -66,10 +66,15 @@ private[spark] class ContextCleaner(sc: SparkContext) extends Logging {
   /**
    * Whether the cleaning thread will block on cleanup tasks.
-   * This is set to true only for tests.
+   *
+   * Due to SPARK-3015, this is set to true by default. This is intended to be only a temporary
+   * workaround for the issue, which is ultimately caused by the way the BlockManager actors
+   * issue inter-dependent blocking Akka messages to each other at high frequencies. This happens,
+   * for instance, when the driver performs a GC and cleans up all broadcast blocks that are no
+   * longer in scope.
    */
   private val blockOnCleanupTasks = sc.conf.getBoolean(
-    "spark.cleaner.referenceTracking.blocking", false)
```
--- End diff --

These changes will not solve the problem here. See [BlockManagerMasterActor.scala#L165](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L165):
```scala
private def removeShuffle(shuffleId: Int): Future[Seq[Boolean]] = {
  // Nothing to do in the BlockManagerMasterActor data structures
  import context.dispatcher
  val removeMsg = RemoveShuffle(shuffleId)
  Future.sequence(
    blockManagerInfo.values.map { bm =>
      // The akkaTimeout is already set here
      bm.slaveActor.ask(removeMsg)(akkaTimeout).mapTo[Boolean]
    }.toSeq
  )
}
```
---
[GitHub] spark pull request: [SPARK-3139] Akka timeouts from ContextCleaner...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2056 [SPARK-3139] Akka timeouts from ContextCleaner when cleaning shuffles You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3139 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2056.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2056 commit a49bc80deac4a4ca1c288e6ed99324f7fb92bc46 Author: GuoQiang Li wi...@qq.com Date: 2014-08-20T08:58:13Z Akka timeouts from ContextCleaner when cleaning shuffles ---
[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52870430 I think the root cause is here: [ShuffleBlockManager.scala#L207](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala#L207). `removeShuffleBlocks` removes all the blocks/files related to a particular shuffle, so there should be a lot of IO wait. In our production cluster, increasing `spark.akka.askTimeout` solves this problem. ---
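As a sketch of the workaround described above, the timeout can be raised in `conf/spark-defaults.conf`. The property name is the one cited in the comment; the value (in seconds) is illustrative, not a recommendation from the thread:

```properties
# Raise the Akka ask timeout so that slow shuffle-file deletion on the
# executors does not trip the ContextCleaner's blocking remove calls.
spark.akka.askTimeout 120
```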
[GitHub] spark pull request: [SPARK-3169]: Fix make-distribution.sh failed
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2075 [SPARK-3169]: Fix make-distribution.sh failed You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3169 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2075.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2075 commit 5f104d05eef830a4f391ac5537474e01c05ba7dc Author: GuoQiang Li wi...@qq.com Date: 2014-08-21T03:45:15Z Fix make-distribution.sh failed ---
[GitHub] spark pull request: [SPARK-3124] [SQL] Fix the assembly jar confli...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2035#issuecomment-52882546 I think we need to modify this file: `sql/hive-thriftserver/pom.xml`
```xml
<dependency>
  <groupId>org.spark-project.hive</groupId>
  <artifactId>hive-cli</artifactId>
  <version>${hive.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.jboss.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```
---
[GitHub] spark pull request: [SPARK-3124] Fix the jar version conflict in u...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2035#issuecomment-52884754 After `./make-distribution.sh -Pyarn -Phadoop-2.3 -Phive-thriftserver -Phive -Dhadoop.version=2.3.0`, running `./bin/spark-sql --hiveconf hive.root.logger=INFO,console` under the `dist` folder seems to work fine. ---
[GitHub] spark pull request: SPARK-2798 [BUILD] Correct several small error...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1726#issuecomment-52922473 I tested it, but the compile failed. ---
[GitHub] spark pull request: SPARK-2798 [BUILD] Correct several small error...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1726#discussion_r16541517

--- Diff: external/flume-sink/pom.xml ---
```
@@ -65,12 +66,9 @@
     </exclusions>
   </dependency>
   <dependency>
-    <groupId>org.scala-lang</groupId>
-    <artifactId>scala-library</artifactId>
-  </dependency>
-  <dependency>
     <groupId>org.scalatest</groupId>
     <artifactId>scalatest_${scala.binary.version}</artifactId>
+    <scope>test</scope>
```
--- End diff --

Add this dependency so that `make-distribution.sh` can work:
```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <scope>test</scope>
</dependency>
```
---
[GitHub] spark pull request: [WIP][SPARK-3098]In some cases, the result of ...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2083 [WIP][SPARK-3098]In some cases, the result of RDD.distinct is inconsistent cc @srowen You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark distinct Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2083.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2083 commit 425c8236d1c4d3b1fb93d2fe4c25d9cba45620fd Author: GuoQiang Li wi...@qq.com Date: 2014-08-20T16:34:27Z The result of RDD.distinct is inconsistent ---
[GitHub] spark pull request: [SPARK-3169]: Fix make-distribution.sh failed
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/2075 ---
[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-53149146 In `removeShuffleBlocks`:
```scala
for (mapId <- state.completedMapTasks; reduceId <- 0 until state.numBuckets) {
  val blockId = new ShuffleBlockId(shuffleId, mapId, reduceId)
  blockManager.diskBlockManager.getFile(blockId).delete()
}
```
Deleting a lot of small files is very time-consuming. ---
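The loop above deletes the shuffle files one at a time, so the wall-clock cost is the sum of every per-file disk latency. A hypothetical sketch of the alternative (issuing the deletions concurrently and waiting for all of them); this is not the Spark code, and `deleteAll` and its timeout are illustrative:

```scala
import java.io.File
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical sketch: delete many small files concurrently instead of
// serially, so the caller does not block on cumulative disk latency.
// Returns the number of files successfully deleted.
def deleteAll(files: Seq[File]): Int = {
  val deletions = files.map(f => Future(f.delete()))
  Await.result(Future.sequence(deletions), 1.minute).count(identity)
}
```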
[GitHub] spark pull request: SPARK-2481: The environment variables SPARK_HI...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1341#issuecomment-53364537 @andrewor14 done ---
[GitHub] spark pull request: SPARK-2482: Resolve sbt warnings during build
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1330#issuecomment-53365696 @andrewor14, @srowen This is mainly to solve the problem of importing `scala.language.postfixOps` and `org.scalatest.time.SpanSugar._` at the same time. ---
[GitHub] spark pull request: [SPARK-3139] Akka timeouts from ContextCleaner...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-53546333 @tdas @pwendell We do not need to wait for `RDD` and `Broadcast` cleanup to finish. #2143 does not solve the timeout in the [removeShuffle method](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L159).
[GitHub] spark pull request: [SPARK-3139] Akka timeouts from ContextCleaner...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-53664268 Increasing the timeout in the [removeBroadcast method](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L175) can avoid SPARK-3015.
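The failure mode being discussed, a blocking call that exceeds its timeout bound, can be sketched as follows (Python, with a hypothetical `slow_remove` standing in for the removeBroadcast round trip; this illustrates the generic timeout pattern, not Spark's actual Akka code):

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def slow_remove():
    # Hypothetical stand-in for a cleanup call that takes ~0.2s.
    time.sleep(0.2)
    return "removed"

with ThreadPoolExecutor(max_workers=1) as pool:
    # A timeout smaller than the operation's duration raises TimeoutError --
    # the failure mode the comments above describe.
    fut = pool.submit(slow_remove)
    try:
        fut.result(timeout=0.05)
        outcome_short = "ok"
    except TimeoutError:
        outcome_short = "timeout"
    # The work still completes in the background; wait for it to finish.
    fut.result()

with ThreadPoolExecutor(max_workers=1) as pool:
    # Raising the bound past the operation's duration avoids the timeout.
    outcome_long = pool.submit(slow_remove).result(timeout=1.0)

print(outcome_short, outcome_long)  # timeout removed
```

The trade-off is that a larger bound only hides slowness; it does not make the underlying cleanup (for example, the file deletion loop) any faster.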
[GitHub] spark pull request: [SPARK-3273]The spark version in the welcome m...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2175 [SPARK-3273]The spark version in the welcome message of spark-shell is not correct You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3273 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2175.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2175 commit 3873f4c7e9f7d331182f4811cd2bee442963e819 Author: GuoQiang Li wi...@qq.com Date: 2014-08-28T04:00:56Z The spark version in the welcome message of spark-shell is not correct
[GitHub] spark pull request: [SPARK-2947] DAGScheduler resubmit the stage i...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1877#issuecomment-53676540 @rxin could you take a look at this PR? Thanks!
[GitHub] spark pull request: [SPARK-2947] DAGScheduler resubmit the stage i...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1877#issuecomment-53677259 [SPARK-3224](https://issues.apache.org/jira/browse/SPARK-3224) is the same problem. This PR adds some boundary checks and removes some redundant code.
[GitHub] spark pull request: [SPARK-3273]The spark version in the welcome m...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/2175#discussion_r16822912 --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoopInit.scala --- @@ -26,7 +26,7 @@ trait SparkILoopInit { __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ - /___/ .__/\_,_/_/ /_/\_\ version 1.0.0-SNAPSHOT + /___/ .__/\_,_/_/ /_/\_\ version 1.1.0-SNAPSHOT --- End diff -- This is a good idea.
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]Collapsed Gibbs sampli...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1983#issuecomment-53700053 @mengxr This patch removes the `accumulable` operation, repairs formula errors in the `dropOneDistSampler` method, and includes some performance optimizations. I do not yet have mature ideas about how to store the model.
[GitHub] spark pull request: [SPARK-3273]The spark version in the welcome m...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2175#issuecomment-53830851 @nchammas We should create a separate JIRA for the Python-related issues.
[GitHub] spark pull request: [SPARK-3273]The spark version in the welcome m...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2175#issuecomment-53830862 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-3302] The wrong version information in ...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2197 [SPARK-3302] The wrong version information in SparkContext You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3302 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2197.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2197 commit 39d6ecfeedec04069b0bce63560a7fb372d672a0 Author: GuoQiang Li wi...@qq.com Date: 2014-08-29T09:43:05Z The wrong version information in SparkContext
[GitHub] spark pull request: [SPARK-3302] The wrong version information in ...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/2197
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]Collapsed Gibbs sampli...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1983#issuecomment-54065708 test this please
[GitHub] spark pull request: [WIP][SPARK-3098]In some cases, the result of ...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/2083
[GitHub] spark pull request: SPARK-2484: Build should not run hive tests by...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1565#issuecomment-54252174 OK, I'll close it.
[GitHub] spark pull request: SPARK-2484: Build should not run hive tests by...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1565
[GitHub] spark pull request: [SPARK-3301]The spark version in the welcome m...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/2196
[GitHub] spark pull request: [SPARK-3301]The spark version in the welcome m...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2196#issuecomment-54267918 @ScrapCodes @nchammas I merged this PR into #2175 and am closing this one.
[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1022#issuecomment-54268060 I merged this PR into #1031 and am closing this one.
[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1022
[GitHub] spark pull request: [Minor]Remove extra semicolon in FlumeStreamSu...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2265 [Minor]Remove extra semicolon in FlumeStreamSuite.scala You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark FlumeStreamSuite Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2265.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2265 commit 6c99e6e133ca2e0872aa5841ab6fb30009aa58bd Author: GuoQiang Li wi...@qq.com Date: 2014-09-04T01:50:36Z Remove extra semicolon in FlumeStreamSuite.scala
[GitHub] spark pull request: [SPARK-3397] Bump pom.xml version number of ma...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2268 [SPARK-3397] Bump pom.xml version number of master branch to 1.2.0-SNAPSHOT You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3397 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2268.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2268 commit eaf913f19ca86663085779b63430b1d21b553585 Author: GuoQiang Li wi...@qq.com Date: 2014-09-04T04:06:31Z Bump pom.xml version number of master branch to 1.2.0-SNAPSHOT
[GitHub] spark pull request: [SPARK-3397] Bump pom.xml version number of ma...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2268#issuecomment-54441310 @srowen I agree with you, but [SparkContext.SPARK_VERSION](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L1300) has already been modified to `1.2.0-SNAPSHOT`.
[GitHub] spark pull request: [SPARK-3124] Fix the jar version conflict in u...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2035#issuecomment-54443276 LGTM
[GitHub] spark pull request: [SPARK-3139] Akka timeouts from ContextCleaner...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/2056
[GitHub] spark pull request: [WIP][SPARK-2167] spark-submit should return e...
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1788
[GitHub] spark pull request: [SPARK-3293] yarn's web show SUCCEEDED when ...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2311 [SPARK-3293] YARN's web UI shows SUCCEEDED when the driver throws an exception in yarn-client mode You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-3293 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2311.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2311 commit 3828707a9b67f0b54a8f6a0a9b36307c2ae14429 Author: GuoQiang Li wi...@qq.com Date: 2014-09-03T06:00:52Z yarn check exit code
[GitHub] spark pull request: [SPARK-2491] Don't handle uncaught exceptions ...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1482#issuecomment-54917849 @aarondav I understand what you mean; I will submit the relevant code tomorrow. BTW, most of the OOMs occur in the deserialization process.
[GitHub] spark pull request: [Minor] rat exclude dependency-reduced-pom.xml
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2326 [Minor] rat exclude dependency-reduced-pom.xml You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark rat-excludes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2326.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2326 commit 860904e96c7a4e06adc80e36163891f9b6f9175d Author: GuoQiang Li wi...@qq.com Date: 2014-09-09T03:09:32Z rat exclude dependency-reduced-pom.xml
[GitHub] spark pull request: [SPARK-2947] DAGScheduler resubmit the stage i...
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/1877#discussion_r17302741
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -1046,41 +1046,37 @@ class DAGScheduler(
case FetchFailed(bmAddress, shuffleId, mapId, reduceId) =>
val failedStage = stageIdToStage(task.stageId)
-val mapStage = shuffleToMapStage(shuffleId)
// It is likely that we receive multiple FetchFailed for a single stage (because we have
// multiple tasks running concurrently on different executors). In that case, it is possible
// the fetch failure has already been handled by the scheduler.
-if (runningStages.contains(failedStage)) {
+if (runningStages.contains(failedStage) && stage.pendingTasks.contains(task)) {
--- End diff --
@rxin Because the running tasks in the stage are not cancelled, `stage.pendingTasks.contains(task)` is necessary.
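The effect of the extra condition can be sketched like this (Python, with hypothetical stand-ins for the scheduler's state; the real DAGScheduler logic is more involved): a duplicate FetchFailed for a task that is no longer pending is ignored instead of being handled twice.

```python
# Hypothetical stand-ins for DAGScheduler state: which stages are currently
# running, and which tasks of each stage are still pending.
running_stages = {"stage1"}
pending_tasks = {"stage1": {"task_a", "task_b"}}

handled = 0

def handle_fetch_failed(stage, task):
    # Mirrors the guard in the diff:
    #   if (runningStages.contains(failedStage) &&
    #       stage.pendingTasks.contains(task))
    global handled
    if stage in running_stages and task in pending_tasks[stage]:
        pending_tasks[stage].discard(task)
        handled += 1  # handle the failure once per still-pending task

# First FetchFailed for task_a is handled; a duplicate for the same task
# (reported by another executor) is skipped, because task_a is no longer
# pending -- its failure has already been processed.
handle_fetch_failed("stage1", "task_a")
handle_fetch_failed("stage1", "task_a")
print(handled)  # 1
```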
[GitHub] spark pull request: [SPARK-2947] DAGScheduler resubmit the stage i...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1877#issuecomment-54976720 screenshots: ![qq20140909-1](https://cloud.githubusercontent.com/assets/302879/4203071/131c5292-382d-11e4-88d3-6d9bb50a8389.png)