[GitHub] spark issue #23121: [SPARK-24553][UI][FOLLOWUP][2.4 Backport] Fix unnecessar...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/23121 OK, thanks for the reminder. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23121: [SPARK-24553][UI][FOLLOWUP][2.4 Backport] Fix unn...
Github user jerryshao closed the pull request at: https://github.com/apache/spark/pull/23121
[GitHub] spark pull request #23121: [SPARK-24553][UI][FOLLOWUP][2.4 Backport] Fix unn...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/23121

[SPARK-24553][UI][FOLLOWUP][2.4 Backport] Fix unnecessary UI redirect

## What changes were proposed in this pull request?

This is a backport PR of #23116. This PR is a follow-up of #21600 to fix the unnecessary UI redirect.

## How was this patch tested?

Local verification

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-24553-branch-2.4

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23121.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #23121

commit c6351f68b4e24834fde503c8d068d2e6d3966348
Author: jerryshao
Date: 2018-11-22T22:54:00Z

    [SPARK-24553][UI][FOLLOWUP] Fix unnecessary UI redirect

    ## What changes were proposed in this pull request?

    This PR is a follow-up of #21600 to fix the unnecessary UI redirect.

    ## How was this patch tested?

    Local verification

    Closes #23116 from jerryshao/SPARK-24553.

    Authored-by: jerryshao
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 76aae7f1fd512f150ffcdb618107b12e1e97fe43)
    Signed-off-by: jerryshao
[GitHub] spark issue #23116: [SPARK-24553][UI][FOLLOWUP] Fix unnecessary UI redirect
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/23116 @dongjoon-hyun, this should also be backported to branch 2.4; let me create a backport PR.
[GitHub] spark issue #23116: [SPARK-24553][UI][FOLLOWUP] Fix unnecessary UI redirect
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/23116 Jenkins, retest this please.
[GitHub] spark pull request #23116: [SPARK-24553][UI][FOLLOWUP] Fix unnecessary UI re...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/23116

[SPARK-24553][UI][FOLLOWUP] Fix unnecessary UI redirect

## What changes were proposed in this pull request?

This PR is a follow-up of #21600 to fix the unnecessary UI redirect.

## How was this patch tested?

Local verification

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-24553

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23116.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #23116

commit dcd517daaee7610cf4b75bd118baf24ba6bf40ae
Author: jerryshao
Date: 2018-11-22T06:43:28Z

    Fix unnecessary UI redirect
[GitHub] spark issue #22441: [SPARK-25445][BUILD] the release script should be able t...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22441 Is it possible to test this on Jenkins?
[GitHub] spark issue #22434: [SPARK-24685][BUILD][FOLLOWUP] Fix the nonexist profile ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22434 @cloud-fan @vanzin, please help to review, thanks!
[GitHub] spark pull request #22434: [SPARK-24685][BUILD][FOLLOWUP] Fix the nonexist p...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/22434

[SPARK-24685][BUILD][FOLLOWUP] Fix the nonexist profile name in release script

## What changes were proposed in this pull request?

The `without-hadoop` profile doesn't exist in Maven; the name should be `hadoop-provided` instead. This is a regression introduced by SPARK-24685, so this PR fixes it.

## How was this patch tested?

Local test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-24685-followup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22434.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22434

commit 18a91354abdf793a569a84046f3bf2016b2ccd03
Author: jerryshao
Date: 2018-09-16T12:29:01Z

    Fix the nonexist profile name in release script
[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22372 Btw, I don't think we can run current Spark with Hadoop 3.1 without any change.
[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22372 Do we officially support Hadoop 3 in branch 2.4? If branch 2.4 doesn't aim to support Hadoop 3 and this fix is only for Hadoop 3, then I don't think it is meaningful to have this fix.
[GitHub] spark issue #22372: [SPARK-25385][BUILD] Upgrade Hadoop 3.1 jackson version ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22372 Jackson versions below 2.9.5 have CVE issues; I would suggest upgrading to 2.9.6 as #21596 did.
[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18142 I see. Thanks for the note.
[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21756 I think the use case here is quite specific, and I'm not sure it is a good idea to make `SparkHadoopUtil` ServiceLoader-able to support your requirement. Typically I don't think users have such a requirement to build their own `SparkHadoopUtil`. I'm wondering whether there are other solutions or workarounds that would support your use case?
[GitHub] spark issue #22299: [SPARK-24748][SS][FOLLOWUP] Switch custom metrics to Uns...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22299 Seems there's another similar PR, #22296.
[GitHub] spark issue #22186: [SPARK-25183][SQL] Spark HiveServer2 to use Spark Shutdo...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 Merging to master branch.
[GitHub] spark pull request #22213: [SPARK-25221][DEPLOY] Consistent trailing whitesp...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22213#discussion_r214244665

--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -1144,6 +1144,46 @@ class SparkSubmitSuite
     conf1.get(PY_FILES.key) should be (s"s3a://${pyFile.getAbsolutePath}")
     conf1.get("spark.submit.pyFiles") should (startWith("/"))
   }
+
+  test("handles natural line delimiters in --properties-file and --conf uniformly") {
+    val delimKey = "spark.my.delimiter."
+    val LF = "\n"
+    val CR = "\r"
+
+    val leadingDelimKeyFromFile = s"${delimKey}leadingDelimKeyFromFile" -> s"${LF}blah"
+    val trailingDelimKeyFromFile = s"${delimKey}trailingDelimKeyFromFile" -> s"blah${CR}"
+    val infixDelimFromFile = s"${delimKey}infixDelimFromFile" -> s"${CR}blah${LF}"
+    val nonDelimSpaceFromFile = s"${delimKey}nonDelimSpaceFromFile" -> " blah\f"
--- End diff --

Sorry for the stupid question. I guess I was thinking of something different.
[GitHub] spark issue #22186: [SPARK-25183][SQL] Spark HiveServer2 to use Spark Shutdo...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 Jenkins, retest this please.
[GitHub] spark issue #22186: [SPARK-25183][SQL] Spark HiveServer2 to use Spark Shutdo...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 I see, thanks for the explanation. I checked the code again; yes, you're right. Let me retrigger the test, and I will merge it if everything is fine.
[GitHub] spark pull request #22279: [SPARK-25277][YARN] YARN applicationMaster metric...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22279#discussion_r214234325

--- Diff: core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala ---
@@ -103,6 +103,14 @@ private[spark] class MetricsSystem private (
     sinks.foreach(_.start)
   }

+  // Same as start but this method only registers sinks
--- End diff --

Would you please explain why only registering sinks could solve the problem here?
[GitHub] spark issue #22279: [SPARK-25277][YARN] YARN applicationMaster metrics shoul...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22279 Hi @LucaCanali, do you have sample output of the current AM metrics? I would like to know what kind of metrics will be output for now.
[GitHub] spark pull request #22289: [SPARK-25200][YARN] Allow specifying HADOOP_CONF_...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22289#discussion_r214233802

--- Diff: launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java ---
@@ -200,6 +200,7 @@ void addOptionString(List cmd, String options) {
     addToClassPath(cp, getenv("HADOOP_CONF_DIR"));
     addToClassPath(cp, getenv("YARN_CONF_DIR"));
+    addToClassPath(cp, getEffectiveConfig().get("spark.yarn.conf.dir"));
--- End diff --

I'm wondering how we would update the classpath to switch to another set of Hadoop confs with InProcessLauncher? The classpath here seems not to be changeable after the JVM is launched.
[GitHub] spark pull request #22213: [SPARK-25221][DEPLOY] Consistent trailing whitesp...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22213#discussion_r214231103

--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -1144,6 +1144,46 @@ class SparkSubmitSuite
     conf1.get(PY_FILES.key) should be (s"s3a://${pyFile.getAbsolutePath}")
     conf1.get("spark.submit.pyFiles") should (startWith("/"))
   }
+
+  test("handles natural line delimiters in --properties-file and --conf uniformly") {
+    val delimKey = "spark.my.delimiter."
+    val LF = "\n"
+    val CR = "\r"
+
+    val leadingDelimKeyFromFile = s"${delimKey}leadingDelimKeyFromFile" -> s"${LF}blah"
+    val trailingDelimKeyFromFile = s"${delimKey}trailingDelimKeyFromFile" -> s"blah${CR}"
+    val infixDelimFromFile = s"${delimKey}infixDelimFromFile" -> s"${CR}blah${LF}"
+    val nonDelimSpaceFromFile = s"${delimKey}nonDelimSpaceFromFile" -> " blah\f"
--- End diff --

@gerashegalov, I'm not sure how we would manually add an LF to the end of a line when editing a property file in an editor. Here in your test, the program code explicitly mimics the case, but in a real scenario I don't see how we would manually update the property file with an additional LF or CR.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 The fix itself LGTM, but I don't think this could solve the STS shutdown hook conflict problem with Hadoop.
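For readers following the shutdown-hook discussion: both Spark and Hadoop keep their own priority-ordered hook lists, and the conflict is that the relative order across the two independent managers is undefined. A minimal sketch of the pattern; the class and method names here are mine, not the actual `ShutdownHookManager` API:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PriorityHooks {
    // (priority, hook) pairs; higher priority runs first, as in both
    // Spark's and Hadoop's shutdown-hook managers.
    private final List<Map.Entry<Integer, Runnable>> hooks = new ArrayList<>();

    public void add(int priority, Runnable hook) {
        hooks.add(new AbstractMap.SimpleEntry<>(priority, hook));
    }

    // Run all hooks once, in descending priority order. When two such
    // managers are installed independently (e.g. Spark's and Hadoop's),
    // ordering across them is undefined; that is the conflict at issue.
    public void runAll() {
        List<Map.Entry<Integer, Runnable>> sorted = new ArrayList<>(hooks);
        sorted.sort((a, b) -> b.getKey() - a.getKey());
        for (Map.Entry<Integer, Runnable> e : sorted) {
            e.getValue().run();
        }
    }
}
```

Registering everything with a single manager, with explicit priorities, is what makes the eventual execution order predictable.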
[GitHub] spark issue #22164: [SPARK-23679][YARN] Setting RM_HA_URLS for AmIpFilter to...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22164 Thanks @vanzin.
[GitHub] spark pull request #22164: [SPARK-23679][YARN] Setting RM_HA_URLS for AmIpFi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22164#discussion_r213168025

--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnRMClient.scala ---
@@ -126,4 +136,21 @@ private[spark] class YarnRMClient extends Logging {
     }
   }

+  private def getUrlByRmId(conf: Configuration, rmId: String): String = {
--- End diff --

For Spark's usage, I think `AmFilterInitializer` may not be so useful, because we need to pass the filter parameters to the driver either over RPC (client mode) or through configuration (cluster mode). Either way we need to know how to set each parameter, so from my understanding using `AmFilterInitializer` would not help much.
[GitHub] spark pull request #22213: [SPARK-25221][DEPLOY] Consistent trailing whitesp...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22213#discussion_r213160007

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2062,8 +2062,10 @@ private[spark] object Utils extends Logging {
     try {
       val properties = new Properties()
       properties.load(inReader)
-      properties.stringPropertyNames().asScala.map(
-        k => (k, properties.getProperty(k).trim)).toMap
+      properties.stringPropertyNames().asScala
+        .map(k => (k, properties.getProperty(k)))
--- End diff --

> trim removes leading spaces as well that are totally legit.

It is hard to say which solution is legit; the way you propose may be valid in your case, but it will be unexpected in other users' cases. I'm not talking about legit or not; what I'm trying to say is that your proposal will break the convention, and that's what I'm concerned about. By ASCII I mean you can pass in the ASCII number and translate it to the actual char in the code; that will mitigate the problem here.
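The ASCII suggestion above amounts to configuring a separator as a decimal character code rather than embedding a raw control character in the file. A small hedged sketch of that convention (the class and method names are mine):

```java
public class AsciiSeparator {
    // Interpret a separator configured as a decimal ASCII code
    // (Hive-style, e.g. "10" for LF, "13" for CR) instead of a raw
    // control character, so whitespace trimming cannot destroy it.
    public static char fromAscii(String code) {
        int n = Integer.parseInt(code.trim());
        if (n < 0 || n > 127) {
            throw new IllegalArgumentException("not an ASCII code: " + code);
        }
        return (char) n;
    }
}
```

Because the configured value is the digits `10` rather than an actual newline, it survives any `.trim` applied while loading the properties file.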
[GitHub] spark pull request #22213: [SPARK-25221][DEPLOY] Consistent trailing whitesp...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22213#discussion_r212889779

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2062,8 +2062,10 @@ private[spark] object Utils extends Logging {
     try {
       val properties = new Properties()
       properties.load(inReader)
-      properties.stringPropertyNames().asScala.map(
-        k => (k, properties.getProperty(k).trim)).toMap
+      properties.stringPropertyNames().asScala
+        .map(k => (k, properties.getProperty(k)))
--- End diff --

The changes here will break the current assumptions. Some editors leave trailing whitespace without removing it even though the user doesn't actually want it; with this change, users would have to check and remove all trailing whitespace themselves to avoid unexpected behavior. AFAIK Hive usually uses an ASCII code or similar to specify a separator, not "\n" or "\r\n", which may be removed or converted during parsing (which is quite brittle). So this is more like something you could fix on your side, not necessarily in Spark.
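The convention under discussion can be observed with plain `java.util.Properties`: `load` already skips whitespace before a value but keeps trailing whitespace and unescapes sequences like `\n`, and the `.trim` in the diff is what strips the surrounding whitespace afterwards. A small sketch (the class name is mine):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropsTrim {
    // Load one property from properties-file text; optionally apply the
    // trim that Spark's Utils.getPropertiesFromFile performs on values.
    public static String load(String text, String key, boolean trim) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) { // cannot happen for a StringReader
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(key);
        return trim ? value.trim() : value;
    }
}
```

For input `k=\nblah ` (an escaped LF followed by a trailing space), the untrimmed load keeps both the leading LF and the trailing space, while the trimmed variant returns just `blah`; that difference is exactly what the PR debates.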
[GitHub] spark issue #22164: [SPARK-23679][YARN] Setting RM_HA_URLS for AmIpFilter to...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22164 I think it should be related to this JIRA (https://issues.apache.org/jira/browse/YARN-7269). Seems like a Hadoop 2.9/3.0+ issue.
[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22186 My local Maven build also failed. I think the problem is that `ShutdownHookManager` is implemented in Scala, and the compiled method signature may be different when invoked from Java. I'm not sure how a Scala anonymous function is translated to Java, but it seems to be due to this issue. (Maven has some detailed failure information, whereas SBT doesn't have anything.)
[GitHub] spark issue #22199: [SPARK-25073][Yarn] AM and Executor Memory validation me...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22199 ok to test.
[GitHub] spark pull request #22213: [SPARK-25221][DEPLOY] Consistent trailing whitesp...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22213#discussion_r212530383

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2062,8 +2062,10 @@ private[spark] object Utils extends Logging {
     try {
       val properties = new Properties()
       properties.load(inReader)
-      properties.stringPropertyNames().asScala.map(
-        k => (k, properties.getProperty(k).trim)).toMap
+      properties.stringPropertyNames().asScala
+        .map(k => (k, properties.getProperty(k)))
--- End diff --

@gerashegalov would you please elaborate on the use case here? I saw that you're treating `\n` as a property value; what is the specific usage scenario?
[GitHub] spark issue #22164: [SPARK-23679][YARN] Fix AmIpFilter cannot work in RM HA ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22164 Gently ping again @vanzin @tgravescs. Thanks!
[GitHub] spark issue #22164: [SPARK-23679][YARN] Fix AmIpFilter cannot work in RM HA ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22164 @vanzin @tgravescs would you please help to review, thanks!
[GitHub] spark pull request #22164: [SPARK-23679][YARN] Fix AmIpFilter cannot work in...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/22164

[SPARK-23679][YARN] Fix AmIpFilter cannot work in RM HA scenario

## What changes were proposed in this pull request?

YARN's `AmIpFilter` adds a new parameter "RM_HA_URLS" to support RM HA, but Spark on YARN doesn't provide such a parameter, so redirects fail when running with RM HA. The detailed exception can be checked in the JIRA. This PR fixes the issue by adding the "RM_HA_URLS" parameter.

## How was this patch tested?

Local verification.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-23679

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22164.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22164

commit da33554cc38d4b41e86dcb6e2c833f5b29c35ad8
Author: jerryshao
Date: 2018-08-20T08:28:13Z

    Fix AmIpFilter cannot work in RM HA scenario
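The fix needs the web address of every ResourceManager in the HA ensemble. A hedged sketch of how an "RM_HA_URLS" value could be derived, with a plain `Map` standing in for Hadoop's `Configuration`; only the two YARN config key patterns are standard, and the class and method names are mine, not the actual Spark patch:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class RmHaUrls {
    // Build a comma-separated list of RM web addresses from YARN HA
    // settings: one yarn.resourcemanager.webapp.address.<rm-id> entry
    // per id listed in yarn.resourcemanager.ha.rm-ids.
    public static String rmHaUrls(Map<String, String> conf) {
        String rmIds = conf.getOrDefault("yarn.resourcemanager.ha.rm-ids", "");
        return Arrays.stream(rmIds.split(","))
                .map(String::trim)
                .filter(id -> !id.isEmpty())
                .map(id -> conf.get("yarn.resourcemanager.webapp.address." + id))
                .collect(Collectors.joining(","));
    }

    // Small helper for building a key/value map from alternating pairs.
    public static Map<String, String> confOf(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }
}
```

The resulting comma-separated string is the shape of value the filter expects for its HA parameter.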
[GitHub] spark issue #22117: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22117 Jenkins, retest this please.
[GitHub] spark issue #22084: [SPARK-25026][BUILD] Binary releases should contain some...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22084 I'm looking at this entirely from the user's standpoint: compared to shipping these slim jars, it would be better to ship the assembly jars, as those can be used directly by adding them to Spark's runtime. The slim jars still require additional third-party jars to work even after being added to the classpath. Shipping them will also prompt questions from users about how to leverage those jars.
[GitHub] spark issue #22084: [SPARK-25026][BUILD] Binary releases should contain some...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22084 From the user's point of view, I don't think it is useful compared to pulling from the Maven repo; the provided jar alone is of no use. But if there's an Apache policy to release all the binaries, then I'm OK with it. I think this is a behavior change; maybe we should target it for 3.0.
[GitHub] spark issue #22005: [SPARK-16817][CORE][WIP] Use Alluxio to improve stabilit...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22005 I believe this kind of PR requires an SPIP and community discussion first.
[GitHub] spark issue #22077: [SPARK-25084][SQL][BACKPORT-2.3] "distribute by" on mult...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22077 This is already merged, @LantaoJin please close this PR, thanks!
[GitHub] spark pull request #22084: [SPARK-25026][BUILD] Binary releases should conta...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22084#discussion_r209481255

--- Diff: dev/make-distribution.sh ---
@@ -188,6 +190,23 @@ if [ -f "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar
   cp "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar "$DISTDIR/yarn"
 fi

+# Only copy external jars if built
+if [ -f "$SPARK_HOME"/external/avro/target/spark-avro_${SCALA_VERSION}-${VERSION}.jar ]; then
+  cp "$SPARK_HOME"/external/avro/target/spark-avro_${SCALA_VERSION}-${VERSION}.jar "$DISTDIR/external/jars/"
+fi
+if [ -f "$SPARK_HOME"/external/kafka-0-10/target/spark-streaming-kafka-0-10_${SCALA_VERSION}-${VERSION}.jar ]; then
+  cp "$SPARK_HOME"/external/kafka-0-10/target/spark-streaming-kafka-0-10_${SCALA_VERSION}-${VERSION}.jar "$DISTDIR/external/jars/"
--- End diff --

When building such an external jar, the assembly jar is also built, and the assembly jar can be used directly. The jars provided here are still not so useful because they lack third-party dependencies like Kafka, so I'm not sure this is more convenient than pulling from the Maven repo directly.
[GitHub] spark pull request #22084: [SPARK-25026][BUILD] Binary releases should conta...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22084#discussion_r209480817

--- Diff: dev/make-distribution.sh ---
@@ -188,6 +190,23 @@ if [ -f "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar
   cp "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar "$DISTDIR/yarn"
 fi

+# Only copy external jars if built
+if [ -f "$SPARK_HOME"/external/avro/target/spark-avro_${SCALA_VERSION}-${VERSION}.jar ]; then
+  cp "$SPARK_HOME"/external/avro/target/spark-avro_${SCALA_VERSION}-${VERSION}.jar "$DISTDIR/external/jars/"
+fi
+if [ -f "$SPARK_HOME"/external/kafka-0-10/target/spark-streaming-kafka-0-10_${SCALA_VERSION}-${VERSION}.jar ]; then
+  cp "$SPARK_HOME"/external/kafka-0-10/target/spark-streaming-kafka-0-10_${SCALA_VERSION}-${VERSION}.jar "$DISTDIR/external/jars/"
--- End diff --

Also, what about Kinesis?
[GitHub] spark pull request #22084: [SPARK-25026][BUILD] Binary releases should conta...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22084#discussion_r209480628

--- Diff: dev/make-distribution.sh ---
@@ -188,6 +190,23 @@ if [ -f "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar
   cp "$SPARK_HOME"/common/network-yarn/target/scala*/spark-*-yarn-shuffle.jar "$DISTDIR/yarn"
 fi

+# Only copy external jars if built
+if [ -f "$SPARK_HOME"/external/avro/target/spark-avro_${SCALA_VERSION}-${VERSION}.jar ]; then
+  cp "$SPARK_HOME"/external/avro/target/spark-avro_${SCALA_VERSION}-${VERSION}.jar "$DISTDIR/external/jars/"
+fi
+if [ -f "$SPARK_HOME"/external/kafka-0-10/target/spark-streaming-kafka-0-10_${SCALA_VERSION}-${VERSION}.jar ]; then
+  cp "$SPARK_HOME"/external/kafka-0-10/target/spark-streaming-kafka-0-10_${SCALA_VERSION}-${VERSION}.jar "$DISTDIR/external/jars/"
--- End diff --

Shall we also copy the assembly jars for Kafka and Flume?
[GitHub] spark issue #22066: [SPARK-25084][SQL] "distribute by" on multiple columns (...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22066 @cloud-fan, yeah, I will include it in 2.3.2.
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22067 ok to test.
[GitHub] spark issue #22055: [MINOR][BUILD] Update Jetty to 9.3.24.v20180605
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22055 Yes @dongjoon-hyun, I will prepare the new RC; nothing is blocking AFAIK.
[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21977 What about R? Do we also need a similar setting for R? I was thinking that with Project Hydrogen, more and more external processes will run inside Spark's executor (MPP); all these external processes require additional memory. Can we make this more general?
[GitHub] spark pull request #22034: [SPARK-25054][CORE] Enable MetricsServlet sink fo...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22034#discussion_r208786793

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -169,6 +171,19 @@ private[spark] class Executor(
     startDriverHeartbeater()

+  /**
+   * We add an empty WebUI in executor to enable Executor MetricsServlet sink if needed.
+   */
+  private val executorWebUiEnabled = conf.getBoolean("spark.executor.ui.enabled", false)
+  private[executor] var webUi: ExecutorWebUI = _
+  if (executorWebUiEnabled && !isLocal) {
+    webUi = new ExecutorWebUI(conf, env.securityManager, SparkUI.getUIPort(conf))
+    webUi.bind()
+    env.metricsSystem.getServletHandlers.foreach(webUi.attachHandler)
+    heartbeatReceiverRef.ask[Boolean](ReportExecutorWebUrl(executorId, webUi.webUrl))
+    logInfo(s"Starting executor web ui at ${webUi.webUrl}")
--- End diff --

Maybe you can use JMX or some other tools. Basically the metrics are still reported every N seconds, so pulling more frequently doesn't help increase accuracy.
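As a concrete example of the JMX route: a metric exposed through the platform MBean server can be pulled by any JMX client (jconsole, jmxterm) without starting a per-executor web server. A minimal hedged sketch; the class names and the `demo.metrics` domain are mine, not any Spark API:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxMetric {
    // Standard MBean contract: the management interface must be named
    // <ImplementationClass>MBean.
    public interface CounterMBean {
        long getCount();
    }

    public static class Counter implements CounterMBean {
        private volatile long count;
        public long getCount() { return count; }
        public void inc() { count++; }
    }

    // Register a counter under the given name so JMX clients can poll it.
    public static Counter register(String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            Counter counter = new Counter();
            server.registerMBean(counter, new ObjectName("demo.metrics:name=" + name));
            return counter;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Read the "Count" attribute back through JMX, as a remote client would.
    public static long read(String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(
                    new ObjectName("demo.metrics:name=" + name), "Count");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off matches the comment above: JMX is pull-based and reuses the JVM's built-in agent, whereas a servlet sink requires binding an extra HTTP port on every executor.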
[GitHub] spark pull request #22034: [SPARK-25054][CORE] Enable MetricsServlet sink fo...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/22034#discussion_r208552871

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -169,6 +171,19 @@ private[spark] class Executor(
     startDriverHeartbeater()

+  /**
+   * We add an empty WebUI in executor to enable Executor MetricsServlet sink if needed.
+   */
+  private val executorWebUiEnabled = conf.getBoolean("spark.executor.ui.enabled", false)
+  private[executor] var webUi: ExecutorWebUI = _
+  if (executorWebUiEnabled && !isLocal) {
+    webUi = new ExecutorWebUI(conf, env.securityManager, SparkUI.getUIPort(conf))
+    webUi.bind()
+    env.metricsSystem.getServletHandlers.foreach(webUi.attachHandler)
+    heartbeatReceiverRef.ask[Boolean](ReportExecutorWebUrl(executorId, webUi.webUrl))
+    logInfo(s"Starting executor web ui at ${webUi.webUrl}")
--- End diff --

It is overkill to start a Jetty server on each executor only for metrics; there are many ways to collect executor metrics other than a servlet. Besides, getting the executor URL requires adding a new heartbeat message. Overall I think this approach is too heavyweight.
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22022 Merged to branch 2.2, please close this PR @mgaido91 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22022 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22022: [SPARK-24948][SHS][BACKPORT-2.2] Delegate check access p...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22022 Sorry, let me test again to see if everything is ok. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22021 @mgaido91 already merged to branch 2.3, please close this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS][BACKPORT-2.3] Delegate check access p...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22021 merging to branch-2.3. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22021: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22021 Please change the title to add branch 2.3 backport tag. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21596 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21596 Are we still waiting for the 2.4 code freeze @gatorsmile @Fokko ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 This should also be backported to branches 2.2 and 2.3 @mridulm, since this is a regression. @mgaido91 would you please create backport PRs for the separate branches? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 Hi @mgaido91, would you please check whether this is auto-mergeable to branches 2.2/2.3? If not, please also prepare the fix for the related branches once this is merged. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 Ping @mridulm, would you please also take a look? Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21953: [SPARK-24992][Core] spark should randomize yarn local di...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21953 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21953: [SPARK-24992][Core] spark should randomize yarn local di...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21953 I see, thanks for explaining. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r207419217 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -80,8 +80,8 @@ import org.apache.spark.util.kvstore._ * break. Simple streaming of JSON-formatted events, as is implemented today, implicitly * maintains this invariant. */ -private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) - extends ApplicationHistoryProvider with Logging { +private[history] class FsHistoryProvider(conf: SparkConf, protected val clock: Clock) + extends ApplicationHistoryProvider with LogFilesBlacklisting with Logging { --- End diff -- This seems not so necessary, let's inline this trait. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r207140685 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -973,6 +985,38 @@ private[history] object FsHistoryProvider { private[history] val CURRENT_LISTING_VERSION = 1L } +/** + * Manages a blacklist containing the files which cannot be read due to lack of access permissions. + */ +private[history] trait LogFilesBlacklisting extends Logging { + protected def clock: Clock + + /** + * Contains the name of blacklisted files and their insertion time. + */ + private val blacklist = new ConcurrentHashMap[String, Long] + + private[history] def isBlacklisted(path: Path): Boolean = { +blacklist.containsKey(path.getName) + } + + private[history] def blacklist(path: Path): Unit = { +blacklist.put(path.getName, clock.getTimeMillis()) + } + + /** + * Removes expired entries in the blacklist, according to the provided `expireTimeInSeconds`. + */ + protected def clearBlacklist(expireTimeInSeconds: Long): Unit = { +val expiredThreshold = clock.getTimeMillis() - expireTimeInSeconds * 1000 +val expired = new mutable.ArrayBuffer[String] +blacklist.asScala.foreach { --- End diff -- AFAIK, `asScala` doesn't copy or create a snapshot of the original map; it just wraps the original map and provides a Scala API. Changes to the original map are also visible through the wrapper returned by `asScala`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
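The point about `asScala` being a live wrapper can be demonstrated in plain Java (a hypothetical example, not Spark code): `ConcurrentHashMap`'s views and iterators are weakly consistent, reflect the underlying map, and tolerate concurrent removal, which is what makes a `clearBlacklist`-style expiry loop safe without extra synchronization.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ViewSemantics {
    // Removes entries older than `threshold` while iterating over a live
    // view of the map. ConcurrentHashMap's weakly consistent iterators
    // never throw ConcurrentModificationException, so removal mid-iteration
    // is safe; this mirrors what Scala's `asScala` wrapper exposes.
    public static int expireOlderThan(ConcurrentHashMap<String, Long> map, long threshold) {
        int removed = 0;
        for (Map.Entry<String, Long> e : map.entrySet()) { // live view, not a snapshot
            if (e.getValue() < threshold) {
                map.remove(e.getKey()); // visible immediately through the view
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Long> blacklist = new ConcurrentHashMap<>();
        blacklist.put("app-1", 100L);
        blacklist.put("app-2", 900L);
        int removed = expireOlderThan(blacklist, 500L);
        System.out.println(removed + " " + blacklist.size()); // 1 1
    }
}
```

The trade-off the thread mentions also shows here: because the iterator is weakly consistent, concurrent insertions during the sweep may or may not be seen, which is acceptable for an expiry pass.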
[GitHub] spark issue #21953: [SPARK-24992][Core] spark should randomize yarn local di...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21953 What kind of behavior did you see? This local dir is only used to store some temporary files, which is not IO intensive, so I don't think the problem here is severe. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 @mridulm would you please also take a review. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r207133206 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -973,6 +985,38 @@ private[history] object FsHistoryProvider { private[history] val CURRENT_LISTING_VERSION = 1L } +/** + * Manages a blacklist containing the files which cannot be read due to lack of access permissions. + */ +private[history] trait LogFilesBlacklisting extends Logging { + protected def clock: Clock + + /** + * Contains the name of blacklisted files and their insertion time. + */ + private val blacklist = new ConcurrentHashMap[String, Long] + + private[history] def isBlacklisted(path: Path): Boolean = { +blacklist.containsKey(path.getName) + } + + private[history] def blacklist(path: Path): Unit = { +blacklist.put(path.getName, clock.getTimeMillis()) + } + + /** + * Removes expired entries in the blacklist, according to the provided `expireTimeInSeconds`. + */ + protected def clearBlacklist(expireTimeInSeconds: Long): Unit = { +val expiredThreshold = clock.getTimeMillis() - expireTimeInSeconds * 1000 +val expired = new mutable.ArrayBuffer[String] +blacklist.asScala.foreach { --- End diff -- Ideally the iteration should be synchronized, but I think it is not a big deal here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r207131160 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -461,32 +462,37 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) logDebug(s"New/updated attempts found: ${updated.size} ${updated.map(_.getPath)}") } - val tasks = updated.map { entry => + val tasks = updated.flatMap { entry => try { - replayExecutor.submit(new Runnable { + val task: Future[Unit] = replayExecutor.submit(new Runnable { override def run(): Unit = mergeApplicationListing(entry, newLastScanTime, true) - }) + }, Unit) + Some(task -> entry.getPath) } catch { // let the iteration over the updated entries break, since an exception on // replayExecutor.submit (..) indicates the ExecutorService is unable // to take any more submissions at this time case e: Exception => logError(s"Exception while submitting event log for replay", e) -null +None } - }.filter(_ != null) + } pendingReplayTasksCount.addAndGet(tasks.size) // Wait for all tasks to finish. This makes sure that checkForLogs // is not scheduled again while some tasks are already running in // the replayExecutor. - tasks.foreach { task => + tasks.foreach { case (task, path) => try { task.get() } catch { case e: InterruptedException => throw e + case e: ExecutionException if e.getCause.isInstanceOf[AccessControlException] => +// We don't have read permissions on the log file +logDebug(s"Unable to read log $path", e.getCause) --- End diff -- I would suggest using a warning log the first time we hit such an issue, to notify the user that some event logs cannot be read. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
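The "warn only the first time" suggestion could be sketched as follows (a hypothetical Java helper, not part of the patch): a concurrent set records which paths have already triggered a warning, so the first failure for a path can be logged at WARN and repeats demoted to DEBUG.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class WarnOnce {
    // Tracks event-log paths that have already produced a WARN message.
    private static final Set<String> warnedPaths = ConcurrentHashMap.newKeySet();

    // Returns true only on the first call for a given path: Set.add() is
    // atomic and returns false for duplicates, so concurrent scans cannot
    // both win for the same path.
    public static boolean shouldWarn(String path) {
        return warnedPaths.add(path);
    }
}
```

A caller would then do something like `if (shouldWarn(path)) logWarning(...) else logDebug(...)` (method names illustrative).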
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r207128637 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -80,8 +80,8 @@ import org.apache.spark.util.kvstore._ * break. Simple streaming of JSON-formatted events, as is implemented today, implicitly * maintains this invariant. */ -private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) - extends ApplicationHistoryProvider with Logging { +private[history] class FsHistoryProvider(conf: SparkConf, protected val clock: Clock) + extends ApplicationHistoryProvider with LogFilesBlacklisting with Logging { --- End diff -- What is the special advantage of using a mixin trait rather than directly changing the code here in `FsHistoryProvider`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 I don't think the problems you mentioned are serious. 1. For the blacklist mechanism, we can have a time-based reviving mechanism to check whether the permission has changed; compared to checking file permissions for all the files, the cost would not be so high. Also, as you mentioned, permissions seldom change, so it is fine even without that. 2. I don't think this is a problem; a try-catch with a proper log should be enough. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21895 My current thinking is to revert SPARK-20172 and improve the logging when an exception is hit during the actual read. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r206726059 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -973,6 +978,42 @@ private[history] object FsHistoryProvider { private[history] val CURRENT_LISTING_VERSION = 1L } +private[history] trait CachedFileSystemHelper extends Logging { + protected def fs: FileSystem + protected def expireTimeInSeconds: Long + + /** + * LRU cache containing the result for the already checked files. + */ + // Visible for testing. + private[history] val cache = CacheBuilder.newBuilder() +.expireAfterAccess(expireTimeInSeconds, TimeUnit.SECONDS) +.build[String, java.lang.Boolean]() --- End diff -- In the real world, there will be many event logs under this folder; this will lead to indefinite memory growth and potentially to OOM. We have seen customers with more than 100K event logs in this folder. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
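Besides time-based expiry, the unbounded-growth concern can be addressed with a size bound. Below is a minimal Java sketch of a size-bounded LRU via `LinkedHashMap` (an illustrative alternative, not the Guava `CacheBuilder` used in the patch; Guava supports the same idea with `maximumSize`):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    // With accessOrder = true, iteration order is least-recently-used first,
    // so the "eldest" entry is the LRU one; evicting it whenever the size
    // exceeds maxEntries caps memory regardless of how many keys are seen.
    private final int maxEntries;

    public BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // access-order iteration
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With 100K event logs, a bound like this keeps the cache at a fixed size instead of growing with the folder.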
[GitHub] spark pull request #21895: [SPARK-24948][SHS] Delegate check access permissi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21895#discussion_r206725814 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -973,6 +978,42 @@ private[history] object FsHistoryProvider { private[history] val CURRENT_LISTING_VERSION = 1L } +private[history] trait CachedFileSystemHelper extends Logging { --- End diff -- As discussed offline, my main concern is cache inconsistency if the user changes the file permissions while the cached entry is still valid. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21867: [SPARK-24307][CORE] Add conf to revert to old cod...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21867#discussion_r205312971 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -731,7 +731,14 @@ private[spark] class BlockManager( } if (data != null) { -return Some(ChunkedByteBuffer.fromManagedBuffer(data, chunkSize)) +// SPARK-24307 undocumented "escape-hatch" in case there are any issues in converting to +// to ChunkedByteBuffer, to go back to old code-path. Can be removed post Spark 2.4 if +// new path is stable. +if (conf.getBoolean("spark.fetchToNioBuffer", false)) { --- End diff -- Maybe we'd better also rename that one to match "spark.maxRemoteBlockSizeFetchToMem"? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21474: [SPARK-24297][CORE] Fetch-to-disk by default for > 2gb
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21474 Hi @squito , would you please also update the changes in the doc, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r204265925 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala --- @@ -152,6 +152,11 @@ package object config { .timeConf(TimeUnit.MILLISECONDS) .createWithDefaultString("100s") + private[spark] val YARN_METRICS_NAMESPACE = ConfigBuilder("spark.yarn.metrics.namespace") --- End diff -- Can you please add this configuration to the yarn doc? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21758 I see, thanks for explaining. Maybe it is worth marking as a TODO. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21758 I mean rddC's partitions are derived from rddA and rddB; here, assume the partitions in rddA require barrier but those in rddB do not. So rddC's partitions are half barrier and half non-barrier. How do you resolve such a conflict: do you mark all of rddC's partitions as barrier, or only the ones coming from rddA? Obviously the partitions from rddB don't require barrier (they are just normal tasks), and forcing them to be barrier imposes a stricter resource demand. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21474: [SPARK-24297][CORE] Fetch-to-disk by default for ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21474#discussion_r204021872 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -429,7 +429,11 @@ package object config { "external shuffle service, this feature can only be worked when external shuffle" + "service is newer than Spark 2.2.") .bytesConf(ByteUnit.BYTE) - .createWithDefault(Long.MaxValue) --- End diff -- I think the original purpose of setting it to `Long.MaxValue` was to avoid using this configuration by default; users should set it to a proper size to enable the feature. But anyway, I think the current change is also fine. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21758 Hi @jiangxb1987, one question about barrier tasks. For example, `rddA` is marked as barrier and `rddB` is a normal RDD. If `rddC = rddA.union(rddB)`, it seems to contain both normal and barrier tasks. Will you mark the tasks generated from `rddB` as barrier tasks as well, or only the tasks from `rddA`? This will potentially affect the resource demands, as `rddA` requires gang-scheduling semantics whereas `rddB` doesn't. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21474: [SPARK-24297][CORE] Fetch-to-disk by default for > 2gb
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21474 I will take a look at this sometime, but don't block on me if it is urgent. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Ignore the files with "local" scheme...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21533 Merging to master branch. Thanks all! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203914040 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -659,6 +659,11 @@ private[spark] class BlockManager( * Get block from remote block managers as serialized bytes. */ def getRemoteBytes(blockId: BlockId): Option[ChunkedByteBuffer] = { +// TODO if we change this method to return the ManagedBuffer, then getRemoteValues +// could just use the inputStream on the temp file, rather than memory-mapping the file. +// Until then, replication can cause the process to use too much memory and get killed +// by the OS / cluster manager (not a java OOM, since its a memory-mapped file) even though +// we've read the data to disk. --- End diff -- I see. I agree with you that YARN could have some issues in calculating the exact memory usage. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203581903 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -659,6 +659,11 @@ private[spark] class BlockManager( * Get block from remote block managers as serialized bytes. */ def getRemoteBytes(blockId: BlockId): Option[ChunkedByteBuffer] = { +// TODO if we change this method to return the ManagedBuffer, then getRemoteValues +// could just use the inputStream on the temp file, rather than memory-mapping the file. +// Until then, replication can cause the process to use too much memory and get killed +// by the OS / cluster manager (not a java OOM, since its a memory-mapped file) even though +// we've read the data to disk. --- End diff -- > not a java OOM, since its a memory-mapped file I'm not sure why a memory-mapped file would use too much memory. AFAIK memory mapping is a lazy, page-wise loading mechanism: the system only loads the to-be-accessed file segments into memory pages, not the whole file. So from my understanding, even a very small amount of physical memory could map a super large file. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
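A small Java sketch of the mechanism being debated (file names and methods are mine, for illustration): `FileChannel.map` only establishes the mapping, and bytes are paged in lazily when touched, which is why a mapping can be far larger than the heap. The debate in the thread is about resident-set accounting by the OS/cluster manager, not about heap allocation.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    // Maps the whole file but reads a single byte; only the touched page is
    // faulted in by the OS, and nothing is copied onto the Java heap.
    public static byte readByteAt(Path file, long offset) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get((int) offset);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-demo", ".bin");
        Files.write(tmp, new byte[] {10, 20, 30, 40});
        System.out.println(readByteAt(tmp, 2)); // 30
        Files.delete(tmp);
    }
}
```

Note that pages touched by reads still count toward the process's resident set until the OS reclaims them, which is the accounting problem the TODO describes.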
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r203580155 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMasterSource.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.deploy.yarn + +import com.codahale.metrics.{Gauge, MetricRegistry} + +import org.apache.spark.metrics.source.Source + +private[spark] class ApplicationMasterSource(yarnAllocator: YarnAllocator) extends Source { + + override val sourceName: String = "applicationMaster" --- End diff -- I see. But I think we may not get "spark.app.id" in AM side, instead I think we can get yarn application id, so either we can set this configuration with application id, or directly prepend to the source name. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203251175 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBufferFileRegion.scala --- @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util.io + +import java.nio.channels.WritableByteChannel + +import io.netty.channel.FileRegion +import io.netty.util.AbstractReferenceCounted + +import org.apache.spark.internal.Logging +import org.apache.spark.network.util.AbstractFileRegion + + +/** + * This exposes a ChunkedByteBuffer as a netty FileRegion, just to allow sending > 2gb in one netty + * message. This is because netty cannot send a ByteBuf > 2g, but it can send a large FileRegion, + * even though the data is not backed by a file. + */ +private[io] class ChunkedByteBufferFileRegion( +private val chunkedByteBuffer: ChunkedByteBuffer, +private val ioChunkSize: Int) extends AbstractFileRegion { + + private var _transferred: Long = 0 + // this duplicates the original chunks, so we're free to modify the position, limit, etc. 
+ private val chunks = chunkedByteBuffer.getChunks() + private val size = chunks.foldLeft(0L) { _ + _.remaining() } + + protected def deallocate: Unit = {} + + override def count(): Long = size + + // this is the "start position" of the overall Data in the backing file, not our current position + override def position(): Long = 0 + + override def transferred(): Long = _transferred + + private var currentChunkIdx = 0 + + def transferTo(target: WritableByteChannel, position: Long): Long = { +assert(position == _transferred) +if (position == size) return 0L +var keepGoing = true +var written = 0L +var currentChunk = chunks(currentChunkIdx) +while (keepGoing) { + while (currentChunk.hasRemaining && keepGoing) { +val ioSize = Math.min(currentChunk.remaining(), ioChunkSize) +val originalLimit = currentChunk.limit() +currentChunk.limit(currentChunk.position() + ioSize) +val thisWriteSize = target.write(currentChunk) +currentChunk.limit(originalLimit) +written += thisWriteSize +if (thisWriteSize < ioSize) { --- End diff -- I see, thanks for explaining. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203250619 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala --- @@ -166,6 +170,34 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) { } +object ChunkedByteBuffer { + // TODO eliminate this method if we switch BlockManager to getting InputStreams + def fromManagedBuffer(data: ManagedBuffer, maxChunkSize: Int): ChunkedByteBuffer = { +data match { + case f: FileSegmentManagedBuffer => +map(f.getFile, maxChunkSize, f.getOffset, f.getLength) + case other => +new ChunkedByteBuffer(other.nioByteBuffer()) +} + } + + def map(file: File, maxChunkSize: Int, offset: Long, length: Long): ChunkedByteBuffer = { +Utils.tryWithResource(new FileInputStream(file).getChannel()) { channel => --- End diff -- I've already updated some of them in SPARK-21475 in the shuffle-related code path, but not all of them, since the rest are not so critical. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203237484 --- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBufferFileRegion.scala --- @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util.io + +import java.nio.channels.WritableByteChannel + +import io.netty.channel.FileRegion +import io.netty.util.AbstractReferenceCounted + +import org.apache.spark.internal.Logging +import org.apache.spark.network.util.AbstractFileRegion + + +/** + * This exposes a ChunkedByteBuffer as a netty FileRegion, just to allow sending > 2gb in one netty + * message. This is because netty cannot send a ByteBuf > 2g, but it can send a large FileRegion, + * even though the data is not backed by a file. + */ +private[io] class ChunkedByteBufferFileRegion( +private val chunkedByteBuffer: ChunkedByteBuffer, +private val ioChunkSize: Int) extends AbstractFileRegion { + + private var _transferred: Long = 0 + // this duplicates the original chunks, so we're free to modify the position, limit, etc. 
+  private val chunks = chunkedByteBuffer.getChunks()
+  private val size = chunks.foldLeft(0L) { _ + _.remaining() }
+
+  protected def deallocate: Unit = {}
+
+  override def count(): Long = size
+
+  // this is the "start position" of the overall data in the backing file, not our current position
+  override def position(): Long = 0
+
+  override def transferred(): Long = _transferred
+
+  private var currentChunkIdx = 0
+
+  def transferTo(target: WritableByteChannel, position: Long): Long = {
+    assert(position == _transferred)
+    if (position == size) return 0L
+    var keepGoing = true
+    var written = 0L
+    var currentChunk = chunks(currentChunkIdx)
+    while (keepGoing) {
+      while (currentChunk.hasRemaining && keepGoing) {
+        val ioSize = Math.min(currentChunk.remaining(), ioChunkSize)
+        val originalLimit = currentChunk.limit()
+        currentChunk.limit(currentChunk.position() + ioSize)
+        val thisWriteSize = target.write(currentChunk)
+        currentChunk.limit(originalLimit)
+        written += thisWriteSize
+        if (thisWriteSize < ioSize) {
--- End diff --

What will happen if `thisWriteSize` is smaller than `ioSize`? Will Spark throw an exception or do something else?
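For context on the question above: `WritableByteChannel.write` may legitimately accept fewer bytes than requested, and a netty `FileRegion` is expected to stop, report how much it transferred, and let netty call `transferTo` again later. The following is a minimal self-contained sketch (not Spark code; the capped channel is hypothetical) of that retry contract:

```scala
import java.nio.ByteBuffer
import java.nio.channels.WritableByteChannel

object PartialWriteDemo {
  // Hypothetical channel that accepts at most `cap` bytes per write call,
  // simulating a socket whose send buffer is nearly full.
  class CappedChannel(cap: Int) extends WritableByteChannel {
    val sink = new java.io.ByteArrayOutputStream()
    override def write(src: ByteBuffer): Int = {
      val n = math.min(cap, src.remaining())
      val tmp = new Array[Byte](n)
      src.get(tmp)
      sink.write(tmp)
      n
    }
    override def isOpen: Boolean = true
    override def close(): Unit = {}
  }

  // One pass in the style of FileRegion.transferTo: a short write simply ends
  // the pass (no exception); the returned count tells the caller to retry.
  def transferOnce(src: ByteBuffer, target: WritableByteChannel, ioChunkSize: Int): Long = {
    var written = 0L
    var keepGoing = true
    while (src.hasRemaining && keepGoing) {
      val ioSize = math.min(src.remaining(), ioChunkSize)
      val originalLimit = src.limit()
      src.limit(src.position() + ioSize)
      val n = target.write(src)
      src.limit(originalLimit)
      written += n
      if (n < ioSize) keepGoing = false // partial write: give up this pass
    }
    written
  }

  def main(args: Array[String]): Unit = {
    val data = ByteBuffer.wrap(Array.tabulate[Byte](100)(_.toByte))
    val ch = new CappedChannel(7)
    var total = 0L
    while (data.hasRemaining) total += transferOnce(data, ch, 32) // retried, as netty would
    assert(total == 100 && ch.sink.size() == 100)
  }
}
```

So a `thisWriteSize < ioSize` is not an error condition; it just terminates the current pass.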
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203236014

--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -166,6 +170,34 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
 }

+object ChunkedByteBuffer {
+  // TODO eliminate this method if we switch BlockManager to getting InputStreams
+  def fromManagedBuffer(data: ManagedBuffer, maxChunkSize: Int): ChunkedByteBuffer = {
+    data match {
+      case f: FileSegmentManagedBuffer =>
+        map(f.getFile, maxChunkSize, f.getOffset, f.getLength)
+      case other =>
+        new ChunkedByteBuffer(other.nioByteBuffer())
+    }
+  }
+
+  def map(file: File, maxChunkSize: Int, offset: Long, length: Long): ChunkedByteBuffer = {
+    Utils.tryWithResource(new FileInputStream(file).getChannel()) { channel =>
--- End diff --

Can we please use `FileChannel#open` instead? `FileInputStream`/`FileOutputStream` have some issues (https://www.cloudbees.com/blog/fileinputstream-fileoutputstream-considered-harmful).
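A minimal sketch of the suggested change (this is an illustration, not the actual patch): `FileChannel.open` avoids the `finalize()` overhead of `FileInputStream` described in the linked article, and the offset/length handling mirrors what `map` needs per chunk.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Files, Path, StandardOpenOption}

object FileChannelOpenDemo {
  // Read `length` bytes starting at `offset`, analogous to one chunk in `map`.
  // `readSegment` is a hypothetical helper name.
  def readSegment(path: Path, offset: Long, length: Int): ByteBuffer = {
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val buf = ByteBuffer.allocate(length)
      channel.position(offset)
      // keep reading until the buffer is full or EOF
      while (buf.hasRemaining && channel.read(buf) >= 0) {}
      buf.flip()
      buf
    } finally {
      channel.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val tmp = Files.createTempFile("demo", ".bin")
    Files.write(tmp, "hello world".getBytes("UTF-8"))
    val buf = readSegment(tmp, 6, 5)
    val out = new Array[Byte](buf.remaining()); buf.get(out)
    assert(new String(out, "UTF-8") == "world")
    Files.delete(tmp)
  }
}
```

In the Spark code the `try`/`finally` would stay wrapped in `Utils.tryWithResource`; only the channel construction changes.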
[GitHub] spark pull request #21440: [SPARK-24307][CORE] Support reading remote cached...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21440#discussion_r203235292

--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -17,17 +17,21 @@
 package org.apache.spark.util.io

-import java.io.InputStream
+import java.io.{File, FileInputStream, InputStream}
 import java.nio.ByteBuffer
-import java.nio.channels.WritableByteChannel
+import java.nio.channels.{FileChannel, WritableByteChannel}
+
+import scala.collection.mutable.ListBuffer

 import com.google.common.primitives.UnsignedBytes
-import io.netty.buffer.{ByteBuf, Unpooled}

 import org.apache.spark.SparkEnv
 import org.apache.spark.internal.config
+import org.apache.spark.network.buffer.{FileSegmentManagedBuffer, ManagedBuffer}
 import org.apache.spark.network.util.ByteArrayWritableChannel
 import org.apache.spark.storage.StorageUtils
+import org.apache.spark.util.Utils
+
--- End diff --

nit: This blank line seems unnecessary.
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r203232034

--- Diff: docs/monitoring.md ---
@@ -435,6 +435,7 @@ set of sinks to which metrics are reported. The following instances are currentl
 * `executor`: A Spark executor.
 * `driver`: The Spark driver process (the process in which your SparkContext is created).
 * `shuffleService`: The Spark shuffle service.
+* `applicationMaster`: The Spark application master on YARN.
--- End diff --

I think it would be better to clarify this as "The Spark ApplicationMaster when running on YARN."
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r203228423

--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMasterSource.scala ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.deploy.yarn
+
+import com.codahale.metrics.{Gauge, MetricRegistry}
+
+import org.apache.spark.metrics.source.Source
+
+private[spark] class ApplicationMasterSource(yarnAllocator: YarnAllocator) extends Source {
+
+  override val sourceName: String = "applicationMaster"
--- End diff --

@tgravescs Would you please explain more? Are you going to add a new configuration, "spark.metrics.namespace"? Also, how would you use this configuration?
[GitHub] spark pull request #21784: [SPARK-24182][YARN][FOLLOW-UP] Turn off noisy log...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21784#discussion_r202929476

--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala ---
@@ -111,7 +111,7 @@ private[spark] class YarnClientSchedulerBackend(
   override def run() {
     try {
       val YarnAppReport(_, state, diags) =
-        client.monitorApplication(appId.get, logApplicationReport = true)
+        client.monitorApplication(appId.get, logApplicationReport = false)
--- End diff --

Yes, it's too verbose currently in client mode; I remember we only had such output in the cluster-mode YARN client. My only concern is that turning this to `false` will also lose the detailed reports. I think it would be better if we still logged the detailed report when the state changes.
[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21756

Would you please explain the scenarios for such usage? `SparkHadoopUtil` is highly Hadoop/YARN dependent; I'm not sure how another customized cluster manager would use it.
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r202886551

--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMasterSource.scala ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.deploy.yarn
+
+import com.codahale.metrics.{Gauge, MetricRegistry}
+
+import org.apache.spark.metrics.source.Source
+
+private[spark] class ApplicationMasterSource(yarnAllocator: YarnAllocator) extends Source {
+
+  override val sourceName: String = "applicationMaster"
--- End diff --

In case this is the metrics output:

```
-- Gauges --
applicationMaster.numContainersPendingAllocate
value = 0
applicationMaster.numExecutorsFailed
value = 3
applicationMaster.numExecutorsRunning
value = 9
applicationMaster.numLocalityAwareTasks
value = 0
applicationMaster.numReleasedContainers
value = 0
...
```

I would suggest adding the application id as a prefix to differentiate between different apps.
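A hedged sketch of the suggestion (illustrative only; the class name, `appId` parameter, and gauge wiring are assumptions, not the actual patch): prefixing the source name with the YARN application id keeps gauges from different apps distinct when they report into a shared sink.

```scala
import com.codahale.metrics.{Gauge, MetricRegistry}

// Hypothetical variant of ApplicationMasterSource; the real one reads its
// values from the YarnAllocator rather than taking a function.
class PrefixedAppMasterSource(appId: String, numExecutorsRunning: () => Int) {
  val metricRegistry = new MetricRegistry()

  // e.g. "application_1530000000000_0001.applicationMaster", so the reported
  // gauge becomes "application_..._0001.applicationMaster.numExecutorsRunning"
  val sourceName: String = s"$appId.applicationMaster"

  metricRegistry.register(MetricRegistry.name("numExecutorsRunning"),
    new Gauge[Int] { override def getValue: Int = numExecutorsRunning() })
}
```

An alternative, as mentioned elsewhere in the thread, is to let `spark.metrics.namespace` supply the per-app prefix instead of hard-coding it into the source name.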
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r202886077

--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -309,6 +312,16 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends
       finish(FinalApplicationStatus.FAILED,
         ApplicationMaster.EXIT_UNCAUGHT_EXCEPTION,
         "Uncaught exception: " + StringUtils.stringifyException(e))
+    } finally {
+      try {
+        metricsSystem.foreach { ms =>
+          ms.report()
+          ms.stop()
+        }
+      } catch {
+        case e: Exception =>
+          logInfo("Exception during stopping of the metric system: ", e)
--- End diff --

I would suggest changing this to a warning log if an exception occurs.
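A minimal sketch of the suggested change (names assumed; `java.util.logging` stands in for Spark's `logWarning` so the snippet is self-contained): a failure while flushing or stopping the metrics system is logged at WARN, since it indicates a real problem, but it is still swallowed so it cannot mask the AM's exit path.

```scala
import java.util.logging.{Level, Logger}

object MetricsShutdownSketch {
  private val log = Logger.getLogger("ApplicationMaster")

  // Minimal stand-in for Spark's MetricsSystem interface.
  trait MetricsSystem { def report(): Unit; def stop(): Unit }

  def shutdown(metricsSystem: Option[MetricsSystem]): Unit = {
    try {
      metricsSystem.foreach { ms =>
        ms.report() // flush one last snapshot before stopping
        ms.stop()
      }
    } catch {
      case e: Exception =>
        // WARN rather than INFO: a failed metrics shutdown is worth surfacing
        log.log(Level.WARNING, "Exception while stopping the metrics system", e)
    }
  }

  def main(args: Array[String]): Unit = {
    shutdown(Some(new MetricsSystem {
      def report(): Unit = throw new RuntimeException("sink unreachable")
      def stop(): Unit = ()
    })) // logs a warning, does not rethrow
    println("AM exit path continues")
  }
}
```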
[GitHub] spark pull request #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21635#discussion_r202015739

--- Diff: docs/monitoring.md ---
@@ -435,6 +435,7 @@ set of sinks to which metrics are reported. The following instances are currentl
 * `executor`: A Spark executor.
 * `driver`: The Spark driver process (the process in which your SparkContext is created).
 * `shuffleService`: The Spark shuffle service.
+* `yarn`: Spark resource allocations on YARN.
--- End diff --

Would it be better to change this to "application master" for easier understanding?
[GitHub] spark issue #21664: [SPARK-24687][CORE] NoClassDefFoundError will not be cat...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21664

You already got an uncaught exception; there's no need to add a warning log. Besides, this is a fatal error: how would the job continue after such an error?
[GitHub] spark issue #21664: [SPARK-24687][CORE] NoClassDefFoundError will not be cat...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21664

The issue is not introduced by Spark itself; it is introduced by user code. Wouldn't it be better to fix it on the user side rather than in Spark? Besides, I'm not so sure Spark should take responsibility for a user-side issue.