[GitHub] spark issue #14807: [SPARK-17256][Deploy, Windows]Check before adding double...
Github user tsudukim commented on the issue: https://github.com/apache/spark/pull/14807 Hi @qualiu, I had a quick look. I believe spark-submit.cmd with a space in its path worked fine when #10789 was merged, so I wonder if this is a problem with `cmd /V /E /C`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13673] [Windows] Fixed not to pollute e...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/11516 [SPARK-13673] [Windows] Fixed not to pollute environment variables. ## What changes were proposed in this pull request? This patch fixes the problem that `bin\beeline.cmd` pollutes environment variables. A similar problem was reported and fixed in https://issues.apache.org/jira/browse/SPARK-3943, but `bin\beeline.cmd` seems to have been added later. ## How was this patch tested? Manual tests: I executed the new `bin\beeline.cmd` and confirmed that %SPARK_HOME% doesn't remain in the command prompt. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-13673 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11516.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11516 commit 21da29590089ddf0c17243ac12b2dcd06b429df3 Author: Masayoshi TSUZUKI Date: 2016-03-04T08:48:09Z [SPARK-13673] [Windows] Fixed not to pollute environment variables.
[GitHub] spark pull request: [SPARK-13592][Windows] fix path of spark-submi...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/11442#issuecomment-191007237 Thanks!
[GitHub] spark pull request: [SPARK-13592][Windows] fix path of spark-submi...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/11442 [SPARK-13592][Windows] fix path of spark-submit2.cmd in spark-submit.cmd ## What changes were proposed in this pull request? This patch fixes the problem that pyspark fails on Windows because it can't find ```spark-submit2.cmd```. ## How was this patch tested? Manual tests: I ran ```bin\pyspark.cmd``` and checked that pyspark launched correctly after this patch was applied. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-13592 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11442.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11442 commit 5a8ea5b4f2efd9b5b9e3b24640b72a7edec58eee Author: Masayoshi TSUZUKI Date: 2016-03-01T09:02:36Z [SPARK-13592][Windows] fix path of spark-submit2.cmd in spark-submit.cmd.
[GitHub] spark pull request: [SPARK-11518] [Deploy, Windows] Handle spaces ...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/10789#issuecomment-174880482 I think just adding the quotation marks is enough to solve this problem.
[GitHub] spark pull request: [SPARK-11518] [Deploy, Windows] Handle spaces ...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/10789#issuecomment-173270284 Thank you @tritab . Actually I haven't tried your new PR yet (because my new PC is not set up), but I looked into your code and I have 2 concerns about using `cd` or `pushd` in the scripts. The 1st is that, as @tritab already mentioned, if we ctrl-c, our terminal might be left in the SPARK_HOME folder. The 2nd is that if we change the current directory, I'm worried that commands which specify relative paths won't work properly. For example, when we execute spark-submit on yarn, we specify the application JAR file like this: ``` bin/spark-submit.cmd --master yarn ...(snip)... lib\spark-examples*.jar ``` If we change the current directory, the relative path seems not to work. The same problem might occur in other situations, like sending JARs with `spark-submit`, or loading a script with `spark-shell` or `pyspark` etc. Did you face any problems using double quotation marks like ``` cmd /V /E /C "%~dp0spark-shell2.cmd" %* ``` instead of using `pushd` ?
[GitHub] spark pull request: [SPARK-11518] [Deploy, Windows] Handle spaces ...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/10789#issuecomment-172741122 @JoshRosen Thank you for your involvement. It seems like a good fix, but it doesn't work in my environment because more files need fixing to handle spaces properly. For example, in `pyspark2.cmd` we should also fix these `call` lines because %SPARK_HOME% contains a space. ``` ...(snip)... call %SPARK_HOME%\bin\load-spark-env.cmd ...(snip)... call %SPARK_HOME%\bin\spark-submit2.cmd pyspark-shell-main --name "PySparkShell" %* ``` This is just one example; there are many other places that should be double-quoted.
[GitHub] spark pull request: [SPARK-7889] [UI] make sure click the "App ID"...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/6545#issuecomment-107833438 I wonder if having a custom HandlerCollection is a good idea. The cause of the problem is the cache mechanism in `HistoryServer`. The current HistoryServer caches all UI objects in `appCache` regardless of whether the application is completed or incomplete. I think it's reasonable to fix the cache layer not to retain incomplete jobs. Setting `spark.history.retainedApplications = 0` (so that the cache is not used) can be a workaround for this problem.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5447#issuecomment-101337444 @vanzin Thank you for your comments. About Windows paths, you're right. Someone might write a path like `C:/foo/bar` even though `/` is not a correct path separator on Windows. As for the other comments, I fixed them.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5447#issuecomment-100205421 I'm so sorry to have left this for such a long time. I modified it as per your comments.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5447#discussion_r29933238 --- Diff: repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkILoop.scala --- @@ -206,7 +206,8 @@ class SparkILoop( // e.g. file:/C:/my/path.jar -> C:/my/path.jar SparkILoop.getAddedJars.map { jar => new URI(jar).getPath.stripPrefix("/") } } else { -SparkILoop.getAddedJars +// We need new URI(jar).getPath here for the case that `jar` includes encoded white space (%20). +SparkILoop.getAddedJars.map { jar => new URI(jar).getPath} --- End diff -- The scala-2.11 REPL seems not to have the equivalent code.
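[Editor's note] A minimal sketch of the behavior the diff relies on, not Spark's code: `java.net.URI#getPath` decodes percent-encoded characters, which is why mapping through `getPath` handles a jar path containing an encoded space (`%20`). The file path used here is hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriGetPathDemo {
    public static void main(String[] args) throws URISyntaxException {
        // getPath() decodes percent-encoding, so %20 becomes a real space
        URI jar = new URI("file:/C:/my%20path/lib.jar");
        System.out.println(jar.getPath());  // prints "/C:/my path/lib.jar"
    }
}
```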
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-96598393 The problem I mentioned was that the spark-shell.cmd called by `SparkLauncherSuite` somehow failed to launch the test application. It turned out to be caused by a limitation of Windows batch: a single command line must be shorter than 8192 characters. (The full path for the classpath was long because I was working in a deeply nested folder.) So I assume all issues are now cleared up. Sorry for my late response.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5227#discussion_r29132469 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java --- @@ -260,15 +260,14 @@ static String quoteForBatchScript(String arg) { quoted.append('"'); break; - case '=': --- End diff -- I've run `SparkLauncherSuite` on Windows and it's OK. If the double quotation marks are parsed properly, `=` inside double quotes does not need to be escaped.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5447#discussion_r29036669 --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala --- @@ -82,7 +82,7 @@ object PythonRunner { s"spark-submit is currently only supported for local files: $path") } val windows = Utils.isWindows || testWindows -var formattedPath = if (windows) Utils.formatWindowsPath(path) else path +var formattedPath = Utils.formatPath(path, windows) --- End diff -- That's right. I'll try to remove them.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-95798208 I was checking the `SparkLauncherSuite` on Windows per vanzin's comment, and faced some trouble. It seems not to be related to this PR, but I'm not sure yet. Please give me a little more time. When I resolve the problem, I'll rebase this PR.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5447#issuecomment-95491028 The problem is that the result differs between Windows and Linux even if the input path strings are exactly the same. We can't use the same test code.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5447#issuecomment-95045723 I tested only on Windows, but I noticed I get different results on Linux. This is because... On Windows: ``` scala> new File("C:\\path\\to\\file.txt").toURI res0: java.net.URI = file:/C:/path/to/file.txt ``` But on Linux: ``` scala> new File("C:\\path\\to\\file.txt").toURI res0: java.net.URI = file:/home/tsudukim/.../C:%5Cpath%5Cto%5Cfile.txt ``` So I think the tests for Windows path should be run only on Windows.
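[Editor's note] The platform difference above can be reproduced directly in Java; a minimal sketch, assuming it runs on a non-Windows JVM: there, `File.isAbsolute()` is false for a `C:\...` string, so `toURI()` resolves it against the current working directory and percent-encodes the backslashes as `%5C`.

```java
import java.io.File;
import java.net.URI;

public class ToUriDemo {
    public static void main(String[] args) {
        URI uri = new File("C:\\path\\to\\file.txt").toURI();
        // On Linux the backslashes survive, percent-encoded as %5C,
        // and the current working directory is prepended.
        System.out.println(uri);
        System.out.println(uri.toString().endsWith("C:%5Cpath%5Cto%5Cfile.txt"));
    }
}
```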
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5447#issuecomment-93976524 Oh, thank you for all of your comments. Fixing it to use `File.toURI` seems better. I've fixed this PR.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5447#discussion_r28302293 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1659,9 +1659,14 @@ private[spark] object Utils extends Logging { val windowsDrive = "([a-zA-Z])".r /** - * Format a Windows path such that it can be safely passed to a URI. + * Format a path such that it can be safely passed to a URI. */ - def formatWindowsPath(path: String): String = path.replace("\\", "/") + def formatPath(path: String, windows: Boolean): String = { +val formatted = path.replace(" ", "%20") --- End diff -- Do you mean we should assume `path` never contains a fragment? If we execute `bin/spark-shell --jars hdfs:/path/to/jar1.jar`, a URI-style string is passed as `path`. So I thought `path` may have a fragment. (Though I wonder if there is any good working example of a path with a fragment...)
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-92251922 Ah, sorry to be late... I didn't have time to work on it. I'll do it this week.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5447#discussion_r28211905 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1659,9 +1659,14 @@ private[spark] object Utils extends Logging { val windowsDrive = "([a-zA-Z])".r /** - * Format a Windows path such that it can be safely passed to a URI. + * Format a path such that it can be safely passed to a URI. */ - def formatWindowsPath(path: String): String = path.replace("\\", "/") + def formatPath(path: String, windows: Boolean): String = { +val formatted = path.replace(" ", "%20") --- End diff -- When the path contains `#`, it doesn't work.
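[Editor's note] A small illustration of why the naive `%20` replacement breaks, using a hypothetical file name not taken from the PR: once the string is handed to `new URI(...)`, everything after `#` is parsed as a fragment, so a literal `#` in a file name silently truncates the path.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FragmentDemo {
    public static void main(String[] args) throws URISyntaxException {
        // A file actually named "a#b.jar" in a directory containing a space
        URI u = new URI("file:/tmp/my%20dir/a#b.jar");
        System.out.println(u.getPath());      // prints "/tmp/my dir/a" -- path is cut at '#'
        System.out.println(u.getFragment());  // prints "b.jar"
    }
}
```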
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5347#issuecomment-90773883 OK, but is it possible to reopen a merged pull request? I can't find the reopen button. If we can't reopen it, I'll send another PR.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5347#discussion_r27783089 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1651,7 +1651,7 @@ private[spark] object Utils extends Logging { /** * Format a Windows path such that it can be safely passed to a URI. */ - def formatWindowsPath(path: String): String = path.replace("\\", "/") + def formatWindowsPath(path: String): String = path.replace("\\", "/").replace(" ", "%20") --- End diff -- Ah, I forgot to consider Linux since I have hardly ever seen paths with spaces on Linux. I'll move this code somewhere else.
[GitHub] spark pull request: [SPARK-6673] spark-shell.cmd can't start in Wi...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5328#issuecomment-89887535 @srowen This version is OK to merge.
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5347#issuecomment-89237471 This PR requires #5227 merged. (https://issues.apache.org/jira/browse/SPARK-6435)
[GitHub] spark pull request: [SPARK-6568] spark-shell.cmd --jars option doe...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/5347 [SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path escape spaces in the arguments. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-6568 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/5347.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5347 commit 9180aafb47b96697492d0d6a87f6061f10e14eeb Author: Masayoshi TSUZUKI Date: 2015-04-03T09:31:47Z [SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path escape spaces in the arguments.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5227#discussion_r27645518 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java --- @@ -260,15 +260,14 @@ static String quoteForBatchScript(String arg) { quoted.append('"'); break; - case '=': -quoted.append('^'); -break; - default: break; } quoted.appendCodePoint(cp); } +if (arg.codePointAt(arg.length() - 1) == '\\') { +quoted.append("\\"); --- End diff -- Backslash is an escape character only when followed by `"`. I assumed the only case that matters is when the string ends with a backslash.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5227#discussion_r27645187 --- Diff: launcher/src/main/java/org/apache/spark/launcher/Main.java --- @@ -101,12 +101,9 @@ public static void main(String[] argsArray) throws Exception { * The method quotes all arguments so that spaces are handled as expected. Quotes within arguments * are "double quoted" (which is batch for escaping a quote). This page has more details about * quoting and other batch script fun stuff: http://ss64.com/nt/syntax-esc.html - * - * The command is executed using "cmd /c" and formatted in single line, since that's the - * easiest way to consume this from a batch script (see spark-class2.cmd). */ private static String prepareWindowsCommand(List cmd, Map childEnv) { -StringBuilder cmdline = new StringBuilder("cmd /c \""); +StringBuilder cmdline = new StringBuilder(""); --- End diff -- The reason we use `cmd.exe /c` in the existing *.cmd files is that the environment would otherwise be polluted by the executed commands. But we already have *2.cmd, and *.cmd calls *2.cmd via `cmd.exe /c`, so I think we don't need another `cmd.exe /c` when executing java.exe. Rather, if we use `cmd.exe /c`, we face another problem. We should execute a command like this: `"C:\Program Files\Java\jdk1.7.0_67\bin\java.exe" -cp "C:\path\to\somewhere\;..." org.apache.spark.deploy.SparkSubmit ...` But the Launcher returns this string: `cmd.exe /c ""C:\Program Files\Java\jdk1.7.0_67\bin\java.exe" -cp "C:\path\to\somewhere\;..." org.apache.spark.deploy.SparkSubmit ..."` `cmd.exe /c` needs a double-quoted command string, but the java.exe path and other arguments might already be double-quoted, so the returned string is malformed.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/5227#discussion_r27644574 --- Diff: launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java --- @@ -260,15 +260,14 @@ static String quoteForBatchScript(String arg) { quoted.append('"'); break; - case '=': -quoted.append('^'); -break; - default: break; } quoted.appendCodePoint(cp); } +if (arg.codePointAt(arg.length() - 1) == '\\') { +quoted.append("\\"); --- End diff -- In batch, a backslash `\` followed by a double quotation mark `"` is parsed as an escape character. For example, `\"` means `"` itself. We face a problem in this case: `java.exe -cp "C:\path\to\directory\" ...` The closing `"` doesn't work properly, so we should escape only the last `\`.
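[Editor's note] The escaping rule described above can be sketched like this; a simplified, hypothetical stand-in for `quoteForBatchScript`, not Spark's actual implementation: embedded quotes are doubled (batch-style escaping), and a trailing backslash is doubled so it cannot escape the closing quote.

```java
public class BatchQuoteDemo {
    // Simplified sketch: double embedded quotes, and double a trailing
    // backslash so it does not swallow the closing quote.
    static String quoteForBatch(String arg) {
        String escaped = arg.replace("\"", "\"\"");
        if (escaped.endsWith("\\")) {
            escaped += "\\";
        }
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // A classpath entry ending in a backslash, as in the comment above
        System.out.println(quoteForBatch("C:\\path\\to\\directory\\"));
        // prints "C:\path\to\directory\\"
    }
}
```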
[GitHub] spark pull request: [SPARK-6673] spark-shell.cmd can't start in Wi...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5328#issuecomment-88838658 This problem was introduced by https://github.com/apache/spark/commit/e3eb393961051a48ed1cac756ac1928156aa161f https://issues.apache.org/jira/browse/SPARK-6406 So this seems to affect only the master branch.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-88820468 Oops, I forgot to include the fixed test code.
[GitHub] spark pull request: [SPARK-6673] spark-shell.cmd can't start in Wi...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/5328 [SPARK-6673] spark-shell.cmd can't start in Windows even when spark was built Added an equivalent script to load-spark-env.sh. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-6673 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/5328.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5328 commit be3405e284ce52f58aab14b894cc437b8763d327 Author: Masayoshi TSUZUKI Date: 2015-04-02T06:50:26Z [SPARK-6673] spark-shell.cmd can't start in Windows even when spark was built added equivalent script to load-spark-env.sh commit aaefb191f47d4b80509ebbfc3beb03f289d915e2 Author: Masayoshi TSUZUKI Date: 2015-04-02T06:53:51Z removed dust.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-88761331 Ah @vanzin, I didn't understand your suggestion. `CommandBuilderUtils` needs to be modified to escape commas, but I think we still need to modify `spark-class2.cmd` as well. When we execute `bin\spark-shell.cmd --jars "C:\Path to\jar1.jar,C:\Path to\jar2.jar"`, we get the following with the original `spark-class2.cmd` even if `CommandBuilderUtils` is fixed to escape commas: `... --jars "C:\Path to\jar1.jar C:\Path to\jar2.jar" ...` but with my PR's `spark-class2.cmd` we get `... --jars "C:\Path to\jar1.jar,C:\Path to\jar2.jar" ...` Two jobs running at the same time is surely a problem, so I added a random string to the filename to reduce the probability that this problem occurs. I know it would be better to solve this problem without using a temporary file if possible, but I have no idea how right now. If you've got any suggestion, please let me know. I'd really like to replace all the *.cmd scripts with PowerShell rather than write ugly code like this.
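The random-suffix approach described above is essentially what Java's `File.createTempFile` provides: a random component in the name makes collisions between concurrent jobs unlikely. A minimal sketch (the file-name prefix is illustrative only, not the PR's actual code):

```java
import java.io.File;
import java.io.IOException;

public class UniqueTempFile {
    public static void main(String[] args) throws IOException {
        // File.createTempFile inserts a random component into the name, so
        // two jobs launched at the same moment get distinct files instead of
        // overwriting each other's generated command line.
        File f1 = File.createTempFile("spark-launcher-output-", ".txt");
        File f2 = File.createTempFile("spark-launcher-output-", ".txt");
        System.out.println(f1.getName().equals(f2.getName())); // false
        f1.delete();
        f2.delete();
    }
}
```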
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-87903556 By "Java application" I mean the launcher `launcher\src\main\java\org\apache\spark\launcher\Main.java`.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/5227#issuecomment-87900130 @vanzin I'm not sure I have got your suggestion right, but as I wrote in JIRA, I think this is not a Java-side problem. https://issues.apache.org/jira/browse/SPARK-6435 Exactly, the `,` is the problem. When the Java application receives the arguments, they are already split by comma. If we want to escape commas, we should do it on the batch side.
[GitHub] spark pull request: [SPARK-6435] spark-shell --jars option does no...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/5227 [SPARK-6435] spark-shell --jars option does not add all jars to classpath Modified to accept double-quoted args properly in spark-shell.cmd. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-6435-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/5227.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5227 commit 04f42914a9c5b0337895fbe0ff3df376b4d0966b Author: Masayoshi TSUZUKI Date: 2015-03-27T09:47:34Z [SPARK-6435] spark-shell --jars option does not add all jars to classpath Modified to accept multiple jars in spark-shell.cmd.
[GitHub] spark pull request: [SPARK-5396] Syntax error in spark scripts on ...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/4428 [SPARK-5396] Syntax error in spark scripts on windows. Modified syntax error in spark-submit2.cmd. Command prompt doesn't have "defined" operator. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-5396 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4428.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4428 commit ec1846579bb0881615d442329101ff80ce61c13d Author: Masayoshi TSUZUKI Date: 2015-02-06T10:29:48Z [SPARK-5396] Syntax error in spark scripts on windows. Modified syntax error in spark-submit2.cmd. Command prompt doesn't have "defined" operator.
[GitHub] spark pull request: [SPARK-1825] Make Windows Spark client work fi...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3943#issuecomment-72012756 @vanzin Actually, the tests for org.apache.spark.deploy.yarn.* fail on Windows even on the master branch. I just ignored that error and checked that no new errors were caused by my patch. (Of course, no new errors were reported.) @aniketbhatnagar Sorry for the delay. I managed to do it, but I don't think this patch will go into 1.2.1. 1.3 might be good.
[GitHub] spark pull request: [SPARK-1825] Make Windows Spark client work fi...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3943#issuecomment-70769057 @sarutak @tgravescs @vanzin Thank you for your comments! Though I don't have enough time over the next several days, I'm going to do it by next week. Sorry for the delay.
[GitHub] spark pull request: [SPARK-1825] Make Windows Spark client work fi...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3943#issuecomment-69281466 thanks @andrewor14 for following this. I have tested it on two YARN clusters: 2.3 and 2.5. Both have 1 master and 3 slaves. From a Linux client, it works fine on both clusters. From a Windows client, it still doesn't work on the 2.3 cluster but works fine on the 2.5 cluster. This patch is implemented to work across YARN versions. When we apply this patch on old YARN clusters (2.2, 2.3), the behaviour doesn't change from now, because the function which the current code uses ($()) is still called by reflection. Only when we apply this patch on new YARN clusters (2.4, 2.5) is the cross-platform version of the function ($$()) called. But I think someone else should test this on their own variety of clusters as well, because it depends strongly on the environment.
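The version-dependent method selection described above (the cross-platform $$() on new YARN, the legacy $() on old YARN) can be sketched with plain reflection. The classes below are hypothetical stand-ins, not the actual YARN Environment API:

```java
import java.lang.reflect.Method;

public class EnvVarStyle {
    // Hypothetical stand-ins: the old method exists on every version,
    // the cross-platform one only on newer ones.
    public static class OldApi {
        public String dollar() { return "$VAR"; }
    }
    public static class NewApi extends OldApi {
        public String dollarDollar() { return "{{VAR}}"; }
    }

    // Prefer the cross-platform accessor when the cluster's classes provide
    // it; otherwise fall back to the legacy one. This mirrors the
    // reflection-based selection described in the comment above.
    public static String expand(Object env) throws Exception {
        Method m;
        try {
            m = env.getClass().getMethod("dollarDollar");
        } catch (NoSuchMethodException e) {
            m = env.getClass().getMethod("dollar");
        }
        return (String) m.invoke(env);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(expand(new OldApi()));  // $VAR
        System.out.println(expand(new NewApi())); // {{VAR}}
    }
}
```

Because the lookup happens at runtime, the same jar runs unchanged against both old and new clusters, which is why the behaviour on 2.2/2.3 stays as it is today.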
[GitHub] spark pull request: [SPARK-1825] Make Windows Spark client work fi...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3943 [SPARK-1825] Make Windows Spark client work fine with Linux YARN cluster Modified environment strings and path separators to platform-independent style if possible. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-1825 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3943.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3943 commit 3d03d353ca1aa45308198565c6343c4d5a2e Author: Masayoshi TSUZUKI Date: 2014-12-04T22:47:27Z [SPARK-1825] Make Windows Spark client work fine with Linux YARN cluster Modified environment strings and path separators to platform-independent style if possible.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-68837548 resolved conflicts!
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/3467#discussion_r22270614 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -180,14 +176,15 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis appListener.startTime.getOrElse(-1L), appListener.endTime.getOrElse(-1L), getModificationTime(dir), - appListener.sparkUser.getOrElse(NOT_STARTED))) + appListener.sparkUser.getOrElse(NOT_STARTED), + !fs.isFile(new Path(dir.getPath(), EventLoggingListener.APPLICATION_COMPLETE } catch { case e: Exception => logInfo(s"Failed to load application log data from $dir.", e) None } } -.sortBy { info => -info.endTime } +.sortBy { info => (-info.endTime, -info.startTime) } --- End diff -- Completed applications are sorted by endTime because they have a proper (almost unique) endTime. Incomplete applications are sorted by startTime because they all share the same invalid endTime (-1); the first sort key is a tie for them, so the second key takes effect.
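The `(-info.endTime, -info.startTime)` ordering in the diff can be sketched in Java with a two-key comparator. This is an illustrative sketch, not Spark's `FsHistoryProvider` code; the `AppInfo` class here is a hypothetical stand-in:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class AppSort {
    public static class AppInfo {
        public final String id;
        public final long startTime;
        public final long endTime; // -1 while the application is incomplete

        public AppInfo(String id, long startTime, long endTime) {
            this.id = id;
            this.startTime = startTime;
            this.endTime = endTime;
        }
    }

    // sortBy { info => (-info.endTime, -info.startTime) } expressed in Java:
    // completed apps order by end time descending; incomplete apps all tie
    // on endTime == -1, so they fall through to start time descending.
    public static List<AppInfo> sorted(List<AppInfo> apps) {
        List<AppInfo> copy = new ArrayList<>(apps);
        copy.sort(Comparator.comparingLong((AppInfo a) -> -a.endTime)
                            .thenComparingLong(a -> -a.startTime));
        return copy;
    }
}
```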
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-68043299 Thank you for your comments! I'm going to do it in a few days!
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-66810500 @andrewor14 Thank you for following this ticket. I finished rebasing to master.
[GitHub] spark pull request: [SPARK-3060] spark-shell.cmd doesn't accept ap...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3350#issuecomment-66720899 Hi @andrewor14, yes, I've tested it on my environment. Would you check it?
[GitHub] spark pull request: [SPARK-2188] Support sbt/sbt for Windows
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3591#issuecomment-65837669 I'm not sure which is better, but I tend to think we should not submit this upstream. It would be a good idea if this were made from the latest sbt script, but unfortunately it is made from our sbt script, which is an old version. If we were to submit this upstream, we would have to update it to be equivalent to the latest sbt, which means the updated Windows script would differ from our current Linux sbt script, so it would be more difficult to maintain.
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
Github user tsudukim closed the pull request at: https://github.com/apache/spark/pull/3280
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3280#issuecomment-65834528 Thank you! @JoshRosen
[GitHub] spark pull request: [SPARK-2188] Support sbt/sbt for Windows
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3591#issuecomment-65638788 @pwendell Thank you for your comment. I quite agree that Windows scripts like .cmd or .bat are very high cost to maintain, but this time I used PowerShell, which is a "scripting language", unlike .cmd or .bat. You can see the script inside. The Linux version and the PowerShell version have the same structure (functions, variables, ...) so I think it's easier to read or modify. And yes, I use Windows for daily Spark development. Between sbt and maven, sbt is much better for trial-and-error development, as you know. I think the reason why I want sbt is the same as why we use sbt rather than maven for development on Linux. I also use maven as a final check, but sbt is more useful for continuous development. About cygwin: I'm not using cygwin. A cygwin environment is so polluted by cygwin functions and variables that the behavior of Windows becomes strange. That's critical for enterprise systems.
[GitHub] spark pull request: [SPARK-4701] Typo in sbt/sbt
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3560 [SPARK-4701] Typo in sbt/sbt Modified typo. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-4701 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3560.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3560 commit 1af3a35be8d154d9cc92650def72f6f0f1b5edc4 Author: Masayoshi TSUZUKI Date: 2014-12-02T19:43:51Z [SPARK-4701] Typo in sbt/sbt Modified typo.
[GitHub] spark pull request: [SPARK-4642] Documents about running-on-YARN n...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3500#issuecomment-65169336 @sryza and @tgravescs Thank you for your review. I removed them. Only `spark.yarn.queue` is added.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-64922238 @ryan-williams Thank you for your review! I fixed them.
[GitHub] spark pull request: [SPARK-4598] use pagination to show tasktable
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3456#issuecomment-64919851 I don't think it's a good idea that we lose the ability to sort tasks globally by anything other than launch time. About OOM, I wrote something in the JIRA ticket.
[GitHub] spark pull request: [SPARK-4642] Documents about running-on-YARN n...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3500 [SPARK-4642] Documents about running-on-YARN needs update Added descriptions about these parameters. - spark.yarn.report.interval - spark.yarn.queue - spark.yarn.user.classpath.first - spark.yarn.scheduler.reporterThread.maxFailures Modified the description about the default value of this parameter. - spark.yarn.submit.file.replication You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-4642 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3500.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3500 commit 88cac9be04bceb76e82ac68445ff0e3ddaab89f7 Author: Masayoshi TSUZUKI Date: 2014-11-28T01:00:25Z [SPARK-4642] Documents about running-on-YARN needs update Added descriptions about these parameters. - spark.yarn.report.interval - spark.yarn.queue - spark.yarn.user.classpath.first - spark.yarn.scheduler.reporterThread.maxFailures Modified the description about the default value of this parameter. - spark.yarn.submit.file.replication
[GitHub] spark pull request: [SPARK-4634] Enable metrics for each applicati...
Github user tsudukim closed the pull request at: https://github.com/apache/spark/pull/3489
[GitHub] spark pull request: [SPARK-4634] Enable metrics for each applicati...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3489#issuecomment-64816105 Sorry, GraphiteSink has already got the option "prefix" and it works fine.
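For reference, the existing "prefix" option mentioned above is set in conf/metrics.properties; a sketch, where the host, port, and prefix values are placeholders:

```properties
# Route all instances' metrics to Graphite, namespaced under a
# per-deployment prefix (values below are placeholders).
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite.example.com
*.sink.graphite.port=2003
*.sink.graphite.prefix=my-cluster
```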
[GitHub] spark pull request: [SPARK-4634] Enable metrics for each applicati...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3489#issuecomment-64737931 Please see https://issues.apache.org/jira/browse/SPARK-4634 for details of this problem.
[GitHub] spark pull request: [SPARK-4634] Enable metrics for each applicati...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3489 [SPARK-4634] Enable metrics for each application to be gathered in one node Added configuration for adding top level name to the metrics name. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-4634 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3489.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3489 commit 6c0171c26ad679a2b46a7bca2b33ce17b770c8c3 Author: Masayoshi TSUZUKI Date: 2014-11-27T02:13:44Z [SPARK-4634] Enable metrics for each application to be gathered in one node. Added configuration for adding top level name to the metrics name.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1558#issuecomment-64510185 @andrewor14 I created a new PR (#3467) as per your comment. Please check it.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-64510079 And these are the screenshots of the new UI. A new link to the page of incomplete applications is added at the bottom. ("Show incomplete applications") ![spark-2458-complete-2](https://cloud.githubusercontent.com/assets/8070366/5195868/2d7646a8-7567-11e4-9c99-fc33f74e1309.png) And this is the new separate page for incomplete applications. Incomplete apps generally don't have an end time, so the end times are all shown as "-". ![spark-2458-incomplete-2](https://cloud.githubusercontent.com/assets/8070366/5195867/2cdf24b2-7567-11e4-9850-17a605c16203.png)
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-64509912 In this PR, the sort order is (-endTime, -startTime), which means the sorting is still by end time for completed apps but by start time for incomplete apps, because incomplete apps generally don't have an end time. Please also see the discussion about the need for this feature in the previous PR: https://github.com/apache/spark/pull/1558
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3467 [SPARK-2458] Make failed application log visible on History Server Enabled HistoryServer to show incomplete applications. We can see the log for incomplete applications by clicking the bottom link. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-2458-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3467.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3467 commit 66fc15b75a16a3dd0345c0fca0babc56c2529af9 Author: Masayoshi TSUZUKI Date: 2014-11-25T08:00:03Z [SPARK-2458] Make failed application log visible on History Server Enabled HistoryServer to show incomplete applications. We can see the log for incomplete applications by clicking the bottom link.
[GitHub] spark pull request: [SPARK-3060] spark-shell.cmd doesn't accept ap...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3350 [SPARK-3060] spark-shell.cmd doesn't accept application options in Windows OS Added a module equivalent to utils.sh and modified spark-shell2.cmd to use it to parse options, so application options can now be used. ex) `bin\spark-shell.cmd --master spark://master:7077 -i path\to\script.txt` You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3060 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3350.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3350 commit 3a11361ac788193eb4bc2acad823e7683bd87062 Author: Masayoshi TSUZUKI Date: 2014-11-18T23:28:23Z [SPARK-3060] spark-shell.cmd doesn't accept application options in Windows OS Added equivalent module as utils.sh and modified spark-shell2.cmd to use it to parse options.
[GitHub] spark pull request: [SPARK-4464] Description about configuration o...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3329 [SPARK-4464] Description about configuration options need to be modified in docs. Added description about -h and -host. Modified description about -i and -ip which are now deprecated. Added description about --properties-file. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-4464 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3329.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3329 commit 6c07caf5d6a82699bfefeed8d8adcc27285f68bf Author: Masayoshi TSUZUKI Date: 2014-11-18T00:13:14Z [SPARK-4464] Description about configuration options need to be modified in docs. Added description about -h and -host. Modified description about -i and -ip which are now deprecated. Added description about --properties-file.
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3280#issuecomment-63205228 Thank you @srowen for following this ticket. I know PRs should generally target master, and I already sent one for master (#3279). But the details of the modification differ between master and branch-1.1 because the filename was changed in master, so I thought branch-1.1 required a separate PR and sent this one (#3280). If you don't need this PR, I will just close it.
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3279#issuecomment-63162179 I sent 2 PRs for [SPARK-4421](https://issues.apache.org/jira/browse/SPARK-4421) because the page names are different between Spark 1.2 and Spark 1.1. This one is for Spark 1.2.
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3280#issuecomment-63162188 I sent 2 PRs for [SPARK-4421](https://issues.apache.org/jira/browse/SPARK-4421) because the page names are different between Spark 1.2 and Spark 1.1. This one is for Spark 1.1.
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3280 [SPARK-4421] Wrong link in spark-standalone.html Modified the link of building Spark. (backport version of #3279.) You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-4421-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3280.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3280 commit 3b4d38d30c71685804fa71ae9dbfaf0068f0e384 Author: Masayoshi TSUZUKI Date: 2014-11-15T01:58:54Z [SPARK-4421] Wrong link in spark-standalone.html Modified the link of building Spark. (backport version of #3279.)
[GitHub] spark pull request: [SPARK-4421] Wrong link in spark-standalone.ht...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/3279 [SPARK-4421] Wrong link in spark-standalone.html Modified the link of building Spark. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-4421 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3279.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3279 commit 56e31c1459044aaacd183ecb82f8ca6dcd040bb7 Author: Masayoshi TSUZUKI Date: 2014-11-15T01:38:02Z Modified the link of building Spark.
[GitHub] spark pull request: [SPARK-3943] Some scripts bin\*.cmd pollutes e...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/2797#issuecomment-59146580 @andrewor14 thank you for following this PR. Yes, that's what I mean. I'm not observing any problems; this is just a safeguard. Polluting the environment might affect not only Spark scripts but also non-Spark batch scripts and applications.
[GitHub] spark pull request: [SPARK-3943] Some scripts bin\*.cmd pollutes e...
Github user tsudukim commented on a diff in the pull request: https://github.com/apache/spark/pull/2797#discussion_r18870526

--- Diff: bin/spark-shell2.cmd ---
@@ -0,0 +1,22 @@
+@echo off
+
+rem
+rem Licensed to the Apache Software Foundation (ASF) under one or more
+rem contributor license agreements. See the NOTICE file distributed with
+rem this work for additional information regarding copyright ownership.
+rem The ASF licenses this file to You under the Apache License, Version 2.0
+rem (the "License"); you may not use this file except in compliance with
+rem the License. You may obtain a copy of the License at
+rem
+rem    http://www.apache.org/licenses/LICENSE-2.0
+rem
+rem Unless required by applicable law or agreed to in writing, software
+rem distributed under the License is distributed on an "AS IS" BASIS,
+rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+rem See the License for the specific language governing permissions and
+rem limitations under the License.
+rem
+
+set SPARK_HOME=%~dp0..
+
+cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell

--- End diff --

Yes it is. Although this script currently has only one environment variable, `SPARK_HOME`, it might become more complicated in the future, just like bin/spark-shell (the Linux version), and then it might export other environment variables.
[GitHub] spark pull request: [SPARK-3943] Some scripts bin\*.cmd pollutes e...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/2797#issuecomment-59027470 Please merge this *AFTER* #2796 is merged, because /python/docs/make2.bat will be ignored by .gitignore in /python by mistake.
[GitHub] spark pull request: [SPARK-3943] Some scripts bin\*.cmd pollutes e...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/2797 [SPARK-3943] Some scripts bin\*.cmd pollutes environment variables in Windows Modified not to pollute environment variables: the main logic is moved from `XXX.cmd` into `XXX2.cmd`, and `XXX.cmd` calls `XXX2.cmd` with the cmd command. `pyspark.cmd` and `spark-class.cmd` already use this approach, but `spark-shell.cmd`, `spark-submit.cmd` and `/python/docs/make.bat` do not. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3943 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2797.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2797 commit b397a7df5004bab26afd6f9650551dc8ed6af5f1 Author: Masayoshi TSUZUKI Date: 2014-10-14T11:02:23Z [SPARK-3943] Some scripts bin\*.cmd pollutes environment variables in Windows Modified not to pollute environment variables.
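The isolation technique this PR describes, doing the variable-setting work in a child interpreter so the calling shell keeps a clean environment, can be sketched in Python. This is an illustration of the general idea, not the actual `.cmd` change; the path `C:\spark` is just an example value:

```python
import os
import subprocess
import sys

# Make sure the parent environment starts clean for the demo.
os.environ.pop("SPARK_HOME", None)

# The child interpreter sets SPARK_HOME and can use it freely...
child_code = (
    "import os; "
    "os.environ['SPARK_HOME'] = r'C:\\spark'; "
    "print(os.environ['SPARK_HOME'])"
)
result = subprocess.run([sys.executable, "-c", child_code],
                        capture_output=True, text=True, check=True)
print(result.stdout.strip())       # the child saw the variable

# ...but the parent's environment is untouched, which is what launching
# the XXX2.cmd logic in a child cmd.exe achieves for the wrapper scripts.
print("SPARK_HOME" in os.environ)  # False
```

The same principle applies to any variables the child sets: they die with the child process instead of leaking into the user's shell.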
[GitHub] spark pull request: [SPARK-3946] gitignore in /python includes wro...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/2796 [SPARK-3946] gitignore in /python includes wrong directory Modified to ignore not the whole docs/ directory but only docs/_build/, which is the output directory of the Sphinx build. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3946 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2796.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2796 commit 2bea6a9ccd8216f6bc641416230376b2d5a41744 Author: Masayoshi TSUZUKI Date: 2014-10-14T11:05:52Z [SPARK-3946] gitignore in /python includes wrong directory Modified to ignore not the docs/ directory, but only the docs/_build/ which is the output directory of sphinx build.
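The intent of the fix can be pictured as narrowing the ignore pattern in python/.gitignore; this is a sketch of the direction of the change, not the literal hunk from the patch:

```diff
-docs/
+docs/_build/
```

With the broader `docs/` pattern, hand-written files under the docs tree (such as the make2.bat mentioned in #2797) were silently ignored along with the Sphinx output.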
[GitHub] spark pull request: [SPARK-3808] PySpark fails to start in Windows
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/2669 [SPARK-3808] PySpark fails to start in Windows Fixed a syntax error in the *.cmd scripts. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3808 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2669.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2669 commit 7f804e6cb7001b1be372940eb186750e4154a83f Author: Masayoshi TSUZUKI Date: 2014-10-06T07:40:07Z [SPARK-3808] PySpark fails to start in Windows Modified syntax error of *.cmd script.
[GitHub] spark pull request: [SPARK-3775] Not suitable error message in spa...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/2640#issuecomment-57772760 Yes, so I removed the specific recommendation for SBT.
[GitHub] spark pull request: [SPARK-3775] Not suitable error message in spa...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/2640 [SPARK-3775] Not suitable error message in spark-shell.cmd Reworded some error messages in bin\*.cmd. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3775 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2640.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2640 commit 3458afb235fcf56a437650467ea05f176d450202 Author: Masayoshi TSUZUKI Date: 2014-10-03T09:07:41Z [SPARK-3775] Not suitable error message in spark-shell.cmd Modified some sentence of error message in bin\*.cmd.
[GitHub] spark pull request: [SPARK-3774] typo comment in bin/utils.sh
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/2639 [SPARK-3774] typo comment in bin/utils.sh Modified the comment of bin/utils.sh. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3774 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2639.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2639 commit 707b7798a9a6350d3e2f85eada93ca135a6a61ca Author: Masayoshi TSUZUKI Date: 2014-10-03T02:15:03Z [SPARK-3774] typo comment in bin/utils.sh Modified the comment of bin/utils.sh.
[GitHub] spark pull request: [SPARK-3758] [Windows] Wrong EOL character in ...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/2612#issuecomment-57606210 Generally, using LF as the EOL character of *.cmd files can cause trouble. For example, when a *.cmd file contains LF line endings and multibyte characters, some characters at the head of a line may be removed internally for some reason. The problem @sarutak mentioned seems to be the same as this. In another case, "goto" may jump to the wrong place. This problem occurred once in the ruby-lang project: their build script for Windows happened to use LF, and they faced "strange behaviour of cmd.exe". https://bugs.ruby-lang.org/issues/10145 I know *.cmd files with LF seem to run well in many cases, but they sometimes cause inexplicable trouble because LF is definitely not the proper EOL character for Windows *.cmd files. So if possible, I think we should use CRLF as the EOL to avoid such trouble.
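One standard way to enforce CRLF endings for such files, regardless of each contributor's core.autocrlf setting, is a .gitattributes rule. This is a general Git facility shown only as an illustration of the safeguard, not necessarily what this PR adds:

```
*.cmd text eol=crlf
*.bat text eol=crlf
```

With these attributes, Git normalizes the files to CRLF in the working tree on checkout, so cmd.exe never sees LF-only batch scripts.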
[GitHub] spark pull request: [SPARK-3757] mvn clean doesn't delete some fil...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/2613 [SPARK-3757] mvn clean doesn't delete some files Added directories to be deleted to the maven-clean-plugin configuration in pom.xml. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3757 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2613.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2613 commit 67c7171e277d099e87972302ee798a310c0da2e6 Author: Masayoshi TSUZUKI Date: 2014-10-01T09:09:56Z [SPARK-3757] mvn clean doesn't delete some files Added directory to be deleted into maven-clean-plugin. commit 8804bfc9f4cc7fe4f803f3145c1fa7f5bc902d70 Author: Masayoshi TSUZUKI Date: 2014-10-01T09:39:32Z Modified indent.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim closed the pull request at: https://github.com/apache/spark/pull/1558
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1558#issuecomment-57430441 Thank you @andrewor14. I've researched this problem over the past few days in our environment, and it turned out to be a very rare case, as @vanzin first suggested (e.g. the JVM is lost and SparkContext::stop() fails to be called, writing to HDFS fails for some reason, etc.). And my PR is not a smart way to solve that rare case, so I'm dropping this PR. Thank you again for your comments.
[GitHub] spark pull request: [SPARK-3006] Failed to execute spark-shell in ...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1918#issuecomment-52134614 Thanks @andrewor14 for following this PR. You're right, so I modified it to put {{%*}} before {{spark-shell}}, but application arguments are still not available until we make a change like #1825 .
[GitHub] spark pull request: [SPARK-3006] Failed to execute spark-shell in ...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/1918 [SPARK-3006] Failed to execute spark-shell in Windows OS Modified the order of the options and arguments in spark-shell.cmd You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-3006 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1918.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1918 commit 1a32410822f1a0372bbad0695579b8ef5973fe1f Author: Masayoshi TSUZUKI Date: 2014-08-13T08:26:01Z [SPARK-3006] Failed to execute spark-shell in Windows OS Modified the order of the options and arguments in spark-shell.cmd
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
Github user tsudukim closed the pull request at: https://github.com/apache/spark/pull/1516
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1516#issuecomment-50313978 SPARK-2567 is resolved by #1566.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1558#issuecomment-50060159 Thank you for following this PR. Let me explain a little. I'm sorry the improper word "uncompleted" made my purpose unclear. The purpose of this PR is to show "failed" apps in the HS, not running apps. But it is true that we can't tell from the log whether an app has already failed or is still running, so as a result both show up in the HS. On the first point: the purpose is to show failed apps from the past, so this PR still matches the concept of the HS. On the second point: the target of this PR is apps that never reach the "finished" state. And on the third point: the sorting is the same in both modes. But your suggestion makes sense; a separate table or tab might be better.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1558#issuecomment-49956854 We get the same UI as now by default. ![spark-2458-notinclude](https://cloud.githubusercontent.com/assets/8070366/3682544/dca4bb96-12cf-11e4-9965-0efa231babd9.png) When the link above the table is clicked, we also get a list that includes the apps which didn't finish successfully. ![spark-2458-include](https://cloud.githubusercontent.com/assets/8070366/3682546/e191fc54-12cf-11e4-9c93-a4a3115f82f2.png)
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/1558 [SPARK-2458] Make failed application log visible on History Server Modified to show uncompleted applications in the History Server UI. Changed the app sort order to be based on startTime (originally endTime) because uncompleted apps don't have a proper endTime. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-2458 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1558.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1558 commit 503d8abb9ae24eb6b211481a60f4b348d125a69a Author: Masayoshi TSUZUKI Date: 2014-07-16T00:12:42Z [SPARK-2458] Make failed application log visible on History Server Modified to show completed applications in History Server ui. Modified apps sort rule to startTime base (originally it was endTime base) because failed apps doesn't have proper endTime.
[GitHub] spark pull request: SPARK-2298: Show stage attempt in UI
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1384#issuecomment-49944917 @lianhuiwang It appears to be a different problem from SPARK-2298. Is your aim the same as this ticket? https://issues.apache.org/jira/browse/SPARK-1362 If so, how about creating a separate PR for it?
[GitHub] spark pull request: SPARK-2298: Show stage attempt in UI
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1384#issuecomment-49944747 @rxin We could certainly fix them all in one patch, but modifying them compatibly in a single patch could be fairly hard work, so my thought was to split the work into several tasks and make #1384 the first step by showing only the attemptId to distinguish attempts. Take whichever approach is more convenient for you.
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1516#issuecomment-49940743 Hi @rxin, thank you for following up on this ticket, but couldn't we separate those problems into different PRs? SPARK-2298 is not about this problem. Otherwise I think it will be hard later to trace why the code was modified and what discussion was held on the topic. (Just as we are now wondering about the intention of last year's commit.)
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1516#issuecomment-49940152 The tests all succeeded again. If @xiajunluan's commit only aimed to avoid the unit test error, I think it should be reverted as in this PR, but I'm wondering whether there was another aim. @xiajunluan, do you remember?
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1516#issuecomment-49919528 Hmm... I didn't notice that. I'm going to rerun the test for confirmation, as @xiajunluan's commit comment suggests.
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1516#issuecomment-49909562 You can see a screenshot generated by the original code in the JIRA: https://issues.apache.org/jira/browse/SPARK-2567 This screenshot was taken after the job completed, but one stage remained as an Active Stage forever. It shouldn't be displayed in the web UI at all, because the corresponding new TaskSet was not submitted and even stage.newAttemptId() wasn't called. This ghost stage sometimes appears when a stage is resubmitted, so this PR prevents the web UI from showing it.
[GitHub] spark pull request: [SPARK-2567] Resubmitted stage sometimes remai...
GitHub user tsudukim opened a pull request: https://github.com/apache/spark/pull/1516 [SPARK-2567] Resubmitted stage sometimes remains as active stage in the web UI Moved the line that posts SparkListenerStageSubmitted to after the checks of task size and serializability. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tsudukim/spark feature/SPARK-2567 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1516.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1516 commit 79f4f98f5d08b7f52bb0216f2b412d959e64ad89 Author: Masayoshi TSUZUKI Date: 2014-07-21T21:05:42Z [SPARK-2567] Resubmitted stage sometimes remains as active stage in the web UI Moved the line which post SparkListenerStageSubmitted to the back of check of tasks size and serializability.
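The reordering described in this PR can be sketched with a hypothetical, simplified model (plain Python, not the actual DAGScheduler Scala code; all names are made up): the stage-submitted event is posted to listeners only after the validation checks pass, so a stage that fails them never shows up as active in the UI.

```python
# Hypothetical sketch of the fix (not Spark's actual code): notify
# listeners only after the task-size and serializability checks pass.
def submit_stage(tasks, max_tasks, listeners):
    # Before the fix, the "StageSubmitted" event was posted here, so a
    # stage whose tasks failed the checks below still looked "active".
    if not tasks or len(tasks) > max_tasks:
        return False  # validation failed: no event is posted
    # (a task-serializability check would also run here in a real scheduler)
    for notify in listeners:
        notify("StageSubmitted")  # event posted only after checks pass
    return True

events = []
submit_stage([], 10, [events.append])      # fails validation, no event
submit_stage(["t1"], 10, [events.append])  # passes
print(events)  # ['StageSubmitted']
```

With the old ordering, the first call would have appended an event too, leaving a "ghost" stage visible even though its tasks were never submitted.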
[GitHub] spark pull request: SPARK-2298: Show stage attempt in UI
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1384#issuecomment-49495727 Modified the PR as per your comments. Thank you!
[GitHub] spark pull request: SPARK-2481: The environment variables SPARK_HI...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1341#issuecomment-49250255 Looks good, but this patch seems to include some diffs unrelated to SPARK-2481: * conf/spark-env.sh.template * docs/spark-standalone.md * sbin/spark-config.sh
[GitHub] spark pull request: SPARK-2298: Show stage attempt in UI
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/1384#issuecomment-49209857 @rxin In #1262, can I expect the key of the stage data in JobProgressListener to become stageId + attemptId instead of stageId only?