[jira] [Updated] (SUBMARINE-252) Launcher user workspace on YARN
[ https://issues.apache.org/jira/browse/SUBMARINE-252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-252: - Summary: Launcher user workspace on YARN (was: Launcher use workspace on YARN) > Launcher user workspace on YARN > --- > > Key: SUBMARINE-252 > URL: https://issues.apache.org/jira/browse/SUBMARINE-252 > Project: Apache Submarine > Issue Type: Sub-task > Components: Submarine Workbench >Reporter: Liu Xun >Priority: Major > > The Submarine Workbench Server submits the user's Workspace container to YARN > (requires support for the Docker module) by using LauncherYarn. This provides a > scalable Workspace container runtime environment on YARN. > > Submarine Workbench Server creates a user's Workspace container in the native > Docker environment by using launcherDocker. In addition, Workbench Server > integrates Submarine Cluster Server to enable multiple Submarine Workbench > Server groups to be built into a Submarine Workbench Server Cluster, providing > a scalable Workspace container runtime environment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (SUBMARINE-57) Add more elaborate message if submarine command is not recognized
[ https://issues.apache.org/jira/browse/SUBMARINE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-57: Fix Version/s: 0.3.0 > Add more elaborate message if submarine command is not recognized > - > > Key: SUBMARINE-57 > URL: https://issues.apache.org/jira/browse/SUBMARINE-57 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.1, 0.3.0 > > Attachments: SUBMARINE-57.001.patch, SUBMARINE-57.001.patch > > > In {{org.apache.hadoop.yarn.submarine.client.cli.Cli#main}}, we have this > error handling: > {code:java} > if (args[0].equals("job")) { > String subCmd = args[1]; > if (subCmd.equals(CliConstants.RUN)) { > new RunJobCli(clientContext).run(moduleArgs); > } else if (subCmd.equals(CliConstants.SHOW)) { > new ShowJobCli(clientContext).run(moduleArgs); > } else { > printHelp(); > throw new IllegalArgumentException("Unknown option for job"); > } > } else { > printHelp(); > throw new IllegalArgumentException("Bad parameters "); > } > {code} > "Bad parameters " needs to be replaced with something that makes more sense. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
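As a rough illustration of what SUBMARINE-57 asks for, the sketch below builds error messages that name the offending argument and list the accepted values. It is not the committed patch: the class `CliErrorMessages`, its method names, and the literal values "run" and "show" (standing in for `CliConstants.RUN` and `CliConstants.SHOW`) are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Submarine. */
final class CliErrorMessages {
  // Assumed top-level commands; the quoted snippet only handles "job".
  static final List<String> COMMANDS = Arrays.asList("job");
  // Assumed values standing in for CliConstants.RUN and CliConstants.SHOW.
  static final List<String> JOB_SUBCOMMANDS = Arrays.asList("run", "show");

  /** Message for an unrecognized first argument (replaces "Bad parameters "). */
  static String unknownCommand(String given) {
    return "Unrecognized command '" + given + "'. Supported commands: "
        + String.join(", ", COMMANDS);
  }

  /** Message for an unrecognized argument after "job". */
  static String unknownJobSubCommand(String given) {
    return "Unrecognized sub-command '" + given + "' for 'job'. Supported sub-commands: "
        + String.join(", ", JOB_SUBCOMMANDS);
  }

  public static void main(String[] args) {
    // In Cli#main these strings would be passed to IllegalArgumentException.
    System.out.println(unknownCommand("jobs"));
    System.out.println(unknownJobSubCommand("list"));
  }
}
```

Naming both the rejected input and the accepted values lets the user correct the call without reading the full help output.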
[jira] [Commented] (SUBMARINE-57) Add more elaborate message if submarine command is not recognized
[ https://issues.apache.org/jira/browse/SUBMARINE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903788#comment-16903788 ] Szilard Nemeth commented on SUBMARINE-57: - Thanks [~adam.antal] for this patch, committed to trunk and branch-3.2! > Add more elaborate message if submarine command is not recognized > - > > Key: SUBMARINE-57 > URL: https://issues.apache.org/jira/browse/SUBMARINE-57 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-57.001.patch, SUBMARINE-57.001.patch > > > In {{org.apache.hadoop.yarn.submarine.client.cli.Cli#main}}, we have this > error handling: > {code:java} > if (args[0].equals("job")) { > String subCmd = args[1]; > if (subCmd.equals(CliConstants.RUN)) { > new RunJobCli(clientContext).run(moduleArgs); > } else if (subCmd.equals(CliConstants.SHOW)) { > new ShowJobCli(clientContext).run(moduleArgs); > } else { > printHelp(); > throw new IllegalArgumentException("Unknown option for job"); > } > } else { > printHelp(); > throw new IllegalArgumentException("Bad parameters "); > } > {code} > "Bad parameters " needs to be replaced with something that makes more sense. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-57) Add more elaborate message if submarine command is not recognized
[ https://issues.apache.org/jira/browse/SUBMARINE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-57: Fix Version/s: (was: 0.2.0) 0.2.1 > Add more elaborate message if submarine command is not recognized > - > > Key: SUBMARINE-57 > URL: https://issues.apache.org/jira/browse/SUBMARINE-57 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.1 > > Attachments: SUBMARINE-57.001.patch, SUBMARINE-57.001.patch > > > In {{org.apache.hadoop.yarn.submarine.client.cli.Cli#main}}, we have this > error handling: > {code:java} > if (args[0].equals("job")) { > String subCmd = args[1]; > if (subCmd.equals(CliConstants.RUN)) { > new RunJobCli(clientContext).run(moduleArgs); > } else if (subCmd.equals(CliConstants.SHOW)) { > new ShowJobCli(clientContext).run(moduleArgs); > } else { > printHelp(); > throw new IllegalArgumentException("Unknown option for job"); > } > } else { > printHelp(); > throw new IllegalArgumentException("Bad parameters "); > } > {code} > "Bad parameters " needs to be replaced with something that makes more sense. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-57) Add more elaborate message if submarine command is not recognized
[ https://issues.apache.org/jira/browse/SUBMARINE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-57: Resolution: Fixed Fix Version/s: 0.2.0 Status: Resolved (was: Patch Available) > Add more elaborate message if submarine command is not recognized > - > > Key: SUBMARINE-57 > URL: https://issues.apache.org/jira/browse/SUBMARINE-57 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.0 > > Attachments: SUBMARINE-57.001.patch, SUBMARINE-57.001.patch > > > In {{org.apache.hadoop.yarn.submarine.client.cli.Cli#main}}, we have this > error handling: > {code:java} > if (args[0].equals("job")) { > String subCmd = args[1]; > if (subCmd.equals(CliConstants.RUN)) { > new RunJobCli(clientContext).run(moduleArgs); > } else if (subCmd.equals(CliConstants.SHOW)) { > new ShowJobCli(clientContext).run(moduleArgs); > } else { > printHelp(); > throw new IllegalArgumentException("Unknown option for job"); > } > } else { > printHelp(); > throw new IllegalArgumentException("Bad parameters "); > } > {code} > "Bad parameters " needs to be replaced with something that makes more sense. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-49: Attachment: SUBMARINE-49.008.patch > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch, SUBMARINE-49.006.patch, SUBMARINE-49.007.patch, > SUBMARINE-49.007.patch, SUBMARINE-49.008.patch, SUBMARINE-49.008.patch, > SUBMARINE-49.008.patch, SUBMARINE-49.008.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (SUBMARINE-81) Add documentation about YAML config parser with example config
[ https://issues.apache.org/jira/browse/SUBMARINE-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-81: --- Assignee: Adam Antal (was: Szilard Nemeth) > Add documentation about YAML config parser with example config > -- > > Key: SUBMARINE-81 > URL: https://issues.apache.org/jira/browse/SUBMARINE-81 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > > YAML parser was added but there's no example documentation of a valid > configuration. > We need to fix this! -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-57) Add more elaborate message if submarine command is not recognized
[ https://issues.apache.org/jira/browse/SUBMARINE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-57: Attachment: SUBMARINE-57.001.patch > Add more elaborate message if submarine command is not recognized > - > > Key: SUBMARINE-57 > URL: https://issues.apache.org/jira/browse/SUBMARINE-57 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-57.001.patch, SUBMARINE-57.001.patch > > > In {{org.apache.hadoop.yarn.submarine.client.cli.Cli#main}}, we have this > error handling: > {code:java} > if (args[0].equals("job")) { > String subCmd = args[1]; > if (subCmd.equals(CliConstants.RUN)) { > new RunJobCli(clientContext).run(moduleArgs); > } else if (subCmd.equals(CliConstants.SHOW)) { > new ShowJobCli(clientContext).run(moduleArgs); > } else { > printHelp(); > throw new IllegalArgumentException("Unknown option for job"); > } > } else { > printHelp(); > throw new IllegalArgumentException("Bad parameters "); > } > {code} > "Bad parameters " needs to be replaced with something that makes more sense. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-68) Add tests to FileSystemOperations class
[ https://issues.apache.org/jira/browse/SUBMARINE-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-68: Attachment: SUBMARINE-68.004.patch > Add tests to FileSystemOperations class > --- > > Key: SUBMARINE-68 > URL: https://issues.apache.org/jira/browse/SUBMARINE-68 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Minor > Attachments: SUBMARINE-68.001.patch, SUBMARINE-68.002.patch, > SUBMARINE-68.003.patch, SUBMARINE-68.004.patch, SUBMARINE-68.004.patch, > SUBMARINE-68.004.patch, SUBMARINE-68.004.patch > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-69) Add tests to ZipUtilities class
[ https://issues.apache.org/jira/browse/SUBMARINE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-69: Attachment: SUBMARINE-69.002.patch > Add tests to ZipUtilities class > --- > > Key: SUBMARINE-69 > URL: https://issues.apache.org/jira/browse/SUBMARINE-69 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Minor > Attachments: SUBMARINE-69.001.patch, SUBMARINE-69.001.patch, > SUBMARINE-69.002.patch, SUBMARINE-69.002.patch, SUBMARINE-69.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-67) Add tests to Localizer class
[ https://issues.apache.org/jira/browse/SUBMARINE-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-67: Attachment: SUBMARINE-67.002.patch > Add tests to Localizer class > > > Key: SUBMARINE-67 > URL: https://issues.apache.org/jira/browse/SUBMARINE-67 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Minor > Attachments: SUBMARINE-67.001.patch, SUBMARINE-67.002.patch, > SUBMARINE-67.002.patch, SUBMARINE-67.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-66: Attachment: SUBMARINE-66.003.patch > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-66.001.patch, SUBMARINE-66.002.patch, > SUBMARINE-66.002.patch, SUBMARINE-66.003.patch, SUBMARINE-66.003.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
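The idea in SUBMARINE-66, building TF_CONFIG as a data structure and serializing it instead of concatenating strings, might look roughly like the sketch below. Everything here is an assumption for illustration: the class `TfConfigSketch`, its methods, and the host names are hypothetical, and the tiny `toJson` method only stands in for a real JSON library so that the example runs without dependencies. It escapes only quotes and backslashes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch; not the code in TensorFlowCommons. */
final class TfConfigSketch {
  /** Builds the TF_CONFIG content as nested maps and lists (insertion-ordered). */
  static Map<String, Object> tfConfig(List<String> workers, List<String> ps,
      String taskType, int taskIndex) {
    Map<String, Object> cluster = new LinkedHashMap<>();
    cluster.put("worker", workers);
    cluster.put("ps", ps);
    Map<String, Object> task = new LinkedHashMap<>();
    task.put("type", taskType);
    task.put("index", taskIndex);
    Map<String, Object> root = new LinkedHashMap<>();
    root.put("cluster", cluster);
    root.put("task", task);
    return root;
  }

  /** Stand-in serializer; real code would hand the map to a JSON library. */
  static String toJson(Object value) {
    if (value == null) {
      return "null";
    }
    if (value instanceof Map) {
      StringJoiner joiner = new StringJoiner(",", "{", "}");
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        joiner.add(quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()));
      }
      return joiner.toString();
    }
    if (value instanceof List) {
      StringJoiner joiner = new StringJoiner(",", "[", "]");
      for (Object item : (List<?>) value) {
        joiner.add(toJson(item));
      }
      return joiner.toString();
    }
    if (value instanceof Number || value instanceof Boolean) {
      return String.valueOf(value);
    }
    return quote(String.valueOf(value));
  }

  // Escapes backslashes and double quotes only; control characters are not handled.
  private static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  public static void main(String[] args) {
    // Host names are made up for the example.
    System.out.println(toJson(tfConfig(
        Arrays.asList("worker-0:8000"), Arrays.asList("ps-0:8000"), "worker", 0)));
  }
}
```

With the structure separated from the serialization, a test can compare the generated data against an expected JSON file rather than against a long inline string.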
[jira] [Updated] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-60: Attachment: SUBMARINE-60.003.patch > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch, > SUBMARINE-60.002.patch, SUBMARINE-60.003.patch, SUBMARINE-60.003.patch, > SUBMARINE-60.003.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
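The factory approach proposed in SUBMARINE-60 could be sketched as follows. All names are stand-ins invented for this example: `ServiceClient` plays the role of AppAdminClient, a plain `Map<String, String>` plays the role of the Configuration object, and `JobSubmitterSketch` plays the role of a caller such as YarnServiceJobSubmitter. None of this is taken from the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for AppAdminClient (hypothetical, single method for brevity). */
interface ServiceClient {
  String name();
}

/** Factory that receives the configuration and returns a client. */
interface ServiceClientFactory {
  ServiceClient create(Map<String, String> conf);
}

/** Stand-in for a production caller; the factory is injected via the constructor. */
final class JobSubmitterSketch {
  private final ServiceClientFactory clientFactory;

  JobSubmitterSketch(ServiceClientFactory clientFactory) {
    this.clientFactory = clientFactory;
  }

  String submit(Map<String, String> conf) {
    // No test-only field is consulted here; the caller decides which factory to use.
    ServiceClient client = clientFactory.create(conf);
    return "submitted via " + client.name();
  }

  public static void main(String[] args) {
    // Production wiring would pass a factory that creates the real client.
    ServiceClientFactory realFactory = conf -> () -> "real-client";
    // A test passes a factory that returns a mock instead.
    ServiceClientFactory mockFactory = conf -> () -> "mock-client";

    Map<String, String> conf = new HashMap<>();
    System.out.println(new JobSubmitterSketch(realFactory).submit(conf));
    System.out.println(new JobSubmitterSketch(mockFactory).submit(conf));
  }
}
```

Because the mock arrives through the constructor, the production class needs no setter marked VisibleForTesting and no static field that only tests ever set.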
[jira] [Updated] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-49: Attachment: SUBMARINE-49.008.patch > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch, SUBMARINE-49.006.patch, SUBMARINE-49.007.patch, > SUBMARINE-49.007.patch, SUBMARINE-49.008.patch, SUBMARINE-49.008.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Fix Version/s: 0.2.1 > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.1, 0.3.0 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Comment Edited] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884994#comment-16884994 ] Szilard Nemeth edited comment on SUBMARINE-62 at 7/15/19 11:09 AM: --- Hi [~adam.antal]! Thanks for your contribution! Committed to trunk and submarine-0.2 branches. was (Author: snemeth): Hi [~adam.antal]! Thanks for your contribution! Committed to trunk. > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.3.0 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Fix Version/s: (was: 0.2.0) > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.1 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Resolution: Fixed Fix Version/s: 0.2.0 Status: Resolved (was: Patch Available) > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.0 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Fix Version/s: 0.3.0 > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.1, 0.3.0 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Fix Version/s: 0.2.1 > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.2.0, 0.2.1 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Fix Version/s: (was: 0.2.1) > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Fix For: 0.3.0 > > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884994#comment-16884994 ] Szilard Nemeth commented on SUBMARINE-62: - Hi [~adam.antal]! Thanks for your contribution! Committed to trunk. > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch, > SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (SUBMARINE-68) Add tests to FileSystemOperations class
[ https://issues.apache.org/jira/browse/SUBMARINE-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883031#comment-16883031 ] Szilard Nemeth commented on SUBMARINE-68: - Hi [~adam.antal]! It seems we had some build issues. Would you please re-upload the patch? Thanks! > Add tests to FileSystemOperations class > --- > > Key: SUBMARINE-68 > URL: https://issues.apache.org/jira/browse/SUBMARINE-68 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Minor > Attachments: SUBMARINE-68.001.patch, SUBMARINE-68.002.patch, > SUBMARINE-68.003.patch, SUBMARINE-68.004.patch, SUBMARINE-68.004.patch, > SUBMARINE-68.004.patch > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883026#comment-16883026 ] Szilard Nemeth commented on SUBMARINE-60: - Hi [~adam.antal]! It seems we had some build issues. Would you please re-upload the patch? Thanks! > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch, > SUBMARINE-60.002.patch, SUBMARINE-60.003.patch, SUBMARINE-60.003.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883025#comment-16883025 ] Szilard Nemeth commented on SUBMARINE-49: - Hi [~adam.antal]! It seems the latest patch could not be applied to trunk. Would you please resolve the conflicts and re-upload? Thanks! > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch, SUBMARINE-49.006.patch, SUBMARINE-49.007.patch, > SUBMARINE-49.007.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (SUBMARINE-80) Tensorflow example command is not valid in documentation
[ https://issues.apache.org/jira/browse/SUBMARINE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883023#comment-16883023 ] Szilard Nemeth commented on SUBMARINE-80: - Hi [~adam.antal]! It seems the latest patch could not be applied to trunk. Would you please resolve the conflicts and re-upload the patch? Thanks! > Tensorflow example command is not valid in documentation > > > Key: SUBMARINE-80 > URL: https://issues.apache.org/jira/browse/SUBMARINE-80 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-80.001.patch, SUBMARINE-80.001.patch > > > [This|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/RunningDistributedCifar10TFJobs.html] > document (title: Running Distributed Cifar10 Tensorflow Estimator Example) > has an invalid command: > {code:java} > yarn jar path/to/hadoop-yarn-applications-submarine-3.2.0-SNAPSHOT.jar \ >job run --name tf-job-001 --verbose --docker_image > hadoopsubmarine/tf-1.8.0-gpu:0.0.1 \ >--input_path hdfs://default/dataset/cifar-10-data \ >--env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ >--env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 >--num_workers 1 --worker_resources memory=8G,vcores=2,gpu=1 \ >--worker_launch_cmd "cd /test/models/tutorials/image/cifar10_estimator && > python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% > --train-steps=1 --eval-batch-size=16 --train-batch-size=16 --num-gpus=2 > --sync" \ >--tensorboard --tensorboard_docker_image wtan/tf-1.8.0-cpu:0.0.3 > {code} > The command is wrong because there are lines without backslashes. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883022#comment-16883022 ] Szilard Nemeth commented on SUBMARINE-66: - Hi [~adam.antal]! It seems the latest patch could not be applied to trunk. Would you please resolve the conflicts and re-upload the patch? Thanks! > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-66.001.patch, SUBMARINE-66.002.patch, > SUBMARINE-66.002.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (SUBMARINE-69) Add tests to ZipUtilities class
[ https://issues.apache.org/jira/browse/SUBMARINE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-69: --- Assignee: Adam Antal (was: Szilard Nemeth) > Add tests to ZipUtilities class > --- > > Key: SUBMARINE-69 > URL: https://issues.apache.org/jira/browse/SUBMARINE-69 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Minor > Attachments: SUBMARINE-69.001.patch, SUBMARINE-69.001.patch, > SUBMARINE-69.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-66: --- Assignee: Adam Antal (was: Szilard Nemeth) > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-66.001.patch, SUBMARINE-66.002.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (SUBMARINE-68) Add tests to FileSystemOperations class
[ https://issues.apache.org/jira/browse/SUBMARINE-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-68: --- Assignee: Adam Antal (was: Szilard Nemeth) > Add tests to FileSystemOperations class > --- > > Key: SUBMARINE-68 > URL: https://issues.apache.org/jira/browse/SUBMARINE-68 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Minor > Attachments: SUBMARINE-68.001.patch, SUBMARINE-68.002.patch, > SUBMARINE-68.003.patch, SUBMARINE-68.004.patch, SUBMARINE-68.004.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-49: --- Assignee: Adam Antal (was: Szilard Nemeth) > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch, SUBMARINE-49.006.patch, SUBMARINE-49.007.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-62: --- Assignee: Adam Antal (was: Szilard Nemeth) > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-82) Fix english grammar mistakes in documentation
[ https://issues.apache.org/jira/browse/SUBMARINE-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853181#comment-16853181 ] Szilard Nemeth commented on SUBMARINE-82: - Hi [~sunilg]! Sure, uploaded a rebased patch. > Fix english grammar mistakes in documentation > - > > Key: SUBMARINE-82 > URL: https://issues.apache.org/jira/browse/SUBMARINE-82 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-82.001.patch, SUBMARINE-82.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-49: Attachment: SUBMARINE-49.007.patch > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch, SUBMARINE-49.006.patch, SUBMARINE-49.007.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-69) Add tests to ZipUtilities class
[ https://issues.apache.org/jira/browse/SUBMARINE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-69: Attachment: SUBMARINE-69.002.patch > Add tests to ZipUtilities class > --- > > Key: SUBMARINE-69 > URL: https://issues.apache.org/jira/browse/SUBMARINE-69 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-69.001.patch, SUBMARINE-69.001.patch, > SUBMARINE-69.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-68) Add tests to FileSystemOperations class
[ https://issues.apache.org/jira/browse/SUBMARINE-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-68: Attachment: SUBMARINE-68.004.patch > Add tests to FileSystemOperations class > --- > > Key: SUBMARINE-68 > URL: https://issues.apache.org/jira/browse/SUBMARINE-68 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-68.001.patch, SUBMARINE-68.002.patch, > SUBMARINE-68.003.patch, SUBMARINE-68.004.patch, SUBMARINE-68.004.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842164#comment-16842164 ] Szilard Nemeth commented on SUBMARINE-60: - Thanks [~pbacsko]! Added a new patch that adds the VisibleForTesting annotation to the method in question! > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch, > SUBMARINE-60.002.patch, SUBMARINE-60.003.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
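The factory approach described in SUBMARINE-60 could look roughly like the sketch below. This is not the code from the attached patches. Configuration and AppAdminClient are tiny stand-ins declared locally so the example compiles on its own (the real ones are Hadoop classes), and ServiceClientFactory and JobSubmitter are hypothetical names chosen for the illustration.

```java
import java.util.Objects;

/** Hypothetical sketch of replacing a static test hook with a factory. */
final class ServiceClientFactorySketch {
  private ServiceClientFactorySketch() {
  }

  /** Stand-in for Hadoop's Configuration; empty for the example. */
  static final class Configuration {
  }

  /** Stand-in for the YARN AppAdminClient; one method is enough here. */
  interface AppAdminClient {
    String describe();
  }

  /** Hypothetical factory: receives the Configuration, returns a client. */
  interface ServiceClientFactory {
    AppAdminClient create(Configuration conf);
  }

  /** Hypothetical consumer that gets its factory through the constructor. */
  static final class JobSubmitter {
    private final ServiceClientFactory factory;
    private final Configuration conf;

    JobSubmitter(ServiceClientFactory factory, Configuration conf) {
      this.factory = Objects.requireNonNull(factory, "factory");
      this.conf = Objects.requireNonNull(conf, "conf");
    }

    String submit() {
      // No test-only branch here: the caller decides which client is built.
      AppAdminClient client = factory.create(conf);
      return "submitted via " + client.describe();
    }
  }

  public static void main(String[] args) {
    ServiceClientFactory production = conf -> () -> "real client";
    ServiceClientFactory mock = conf -> () -> "mock client";
    System.out.println(
        new JobSubmitter(production, new Configuration()).submit());
    System.out.println(new JobSubmitter(mock, new Configuration()).submit());
  }
}
```

The point of the design is that production code no longer carries a field that exists only for tests; a test simply passes a factory that returns a mock.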
[jira] [Updated] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-60: Attachment: SUBMARINE-60.003.patch > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch, > SUBMARINE-60.002.patch, SUBMARINE-60.003.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-68) Add tests to FileSystemOperations class
[ https://issues.apache.org/jira/browse/SUBMARINE-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842161#comment-16842161 ] Szilard Nemeth commented on SUBMARINE-68: - Hi [~adam.antal]! Thanks for the valuable comments! 1. Yep, it was indeed a red flag. Please check my updated solution for this issue. 2. Fixed the javadoc. 3. Extracted the temp dir File object into a new static final field as you suggested. 4. Fixed the javadoc of downloadRemoteFile. 5. Extracted the constant from getMaxRemoteFileSizeMB. 6. Removed the Non HDFS comment as it did not make any sense. 7. Fixed the manually crafted File path string as you suggested. Good point! Btw, this is not my code, so I inherited these hand-crafted file paths rather than writing them that way intentionally. Testcases: 1. Indeed, the setupService method was copied from another place and did not make sense in this place at all. 2. Moved FILE_SCHEME to the suggested place. 3. Yes, these files are cleaned up. > Add tests to FileSystemOperations class > --- > > Key: SUBMARINE-68 > URL: https://issues.apache.org/jira/browse/SUBMARINE-68 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-68.001.patch, SUBMARINE-68.002.patch, > SUBMARINE-68.003.patch, SUBMARINE-68.004.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841986#comment-16841986 ] Szilard Nemeth commented on SUBMARINE-62: - Hi [~pbacsko]! Did a rebase to trunk and fixed the constants you mentioned. Please check the patch again! > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-62) PS_LAUNCH_CMD CLI description is wrong in RunJobCli
[ https://issues.apache.org/jira/browse/SUBMARINE-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-62: Attachment: SUBMARINE-62.002.patch > PS_LAUNCH_CMD CLI description is wrong in RunJobCli > --- > > Key: SUBMARINE-62 > URL: https://issues.apache.org/jira/browse/SUBMARINE-62 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-62.001.patch, SUBMARINE-62.002.patch > > > See: > [https://github.com/apache/hadoop/blob/trunk/hadoop-submarine/hadoop-submarine-core/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java#L118-L120] > Currently, the description is: > "Commandline of worker, arguments will be directly used to launch the PS" > Shouldn't the description start with "Commandline of PS..."? > The rest of it can remain intact. > [~tangzhankun]: What's your opinion? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-82) Fix english grammar mistakes in documentation
[ https://issues.apache.org/jira/browse/SUBMARINE-82?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-82: Attachment: SUBMARINE-82.001.patch > Fix english grammar mistakes in documentation > - > > Key: SUBMARINE-82 > URL: https://issues.apache.org/jira/browse/SUBMARINE-82 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-82.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-82) Fix english grammar mistakes in documentation
[ https://issues.apache.org/jira/browse/SUBMARINE-82?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-82: Status: Patch Available (was: In Progress) > Fix english grammar mistakes in documentation > - > > Key: SUBMARINE-82 > URL: https://issues.apache.org/jira/browse/SUBMARINE-82 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-82.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-82) Fix english grammar mistakes in documentation
Szilard Nemeth created SUBMARINE-82: --- Summary: Fix english grammar mistakes in documentation Key: SUBMARINE-82 URL: https://issues.apache.org/jira/browse/SUBMARINE-82 Project: Hadoop Submarine Issue Type: Bug Reporter: Szilard Nemeth Assignee: Szilard Nemeth -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-80) Tensorflow example command is not valid in documentation
[ https://issues.apache.org/jira/browse/SUBMARINE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-80: Attachment: SUBMARINE-80.001.patch > Tensorflow example command is not valid in documentation > > > Key: SUBMARINE-80 > URL: https://issues.apache.org/jira/browse/SUBMARINE-80 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-80.001.patch > > > [This|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/RunningDistributedCifar10TFJobs.html] > document (title: Running Distributed Cifar10 Tensorflow Estimator Example) > has an invalid command: > {code:java} > yarn jar path/to/hadoop-yarn-applications-submarine-3.2.0-SNAPSHOT.jar \ >job run --name tf-job-001 --verbose --docker_image > hadoopsubmarine/tf-1.8.0-gpu:0.0.1 \ >--input_path hdfs://default/dataset/cifar-10-data \ >--env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ >--env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 >--num_workers 1 --worker_resources memory=8G,vcores=2,gpu=1 \ >--worker_launch_cmd "cd /test/models/tutorials/image/cifar10_estimator && > python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% > --train-steps=1 --eval-batch-size=16 --train-batch-size=16 --num-gpus=2 > --sync" \ >--tensorboard --tensorboard_docker_image wtan/tf-1.8.0-cpu:0.0.3 > {code} > The command is wrong because there are lines without backslashes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-80) Tensorflow example command is not valid in documentation
[ https://issues.apache.org/jira/browse/SUBMARINE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-80: Status: Patch Available (was: In Progress) > Tensorflow example command is not valid in documentation > > > Key: SUBMARINE-80 > URL: https://issues.apache.org/jira/browse/SUBMARINE-80 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-80.001.patch > > > [This|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/RunningDistributedCifar10TFJobs.html] > document (title: Running Distributed Cifar10 Tensorflow Estimator Example) > has an invalid command: > {code:java} > yarn jar path/to/hadoop-yarn-applications-submarine-3.2.0-SNAPSHOT.jar \ >job run --name tf-job-001 --verbose --docker_image > hadoopsubmarine/tf-1.8.0-gpu:0.0.1 \ >--input_path hdfs://default/dataset/cifar-10-data \ >--env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ >--env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 >--num_workers 1 --worker_resources memory=8G,vcores=2,gpu=1 \ >--worker_launch_cmd "cd /test/models/tutorials/image/cifar10_estimator && > python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% > --train-steps=1 --eval-batch-size=16 --train-batch-size=16 --num-gpus=2 > --sync" \ >--tensorboard --tensorboard_docker_image wtan/tf-1.8.0-cpu:0.0.3 > {code} > The command is wrong because there are lines without backslashes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841278#comment-16841278 ] Szilard Nemeth commented on SUBMARINE-56: - Thanks [~sunilg]! > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Fix For: 0.2.0 > > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch, SUBMARINE-56.004.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-67) Add tests to Localizer class
[ https://issues.apache.org/jira/browse/SUBMARINE-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840492#comment-16840492 ] Szilard Nemeth commented on SUBMARINE-67: - Hi [~pbacsko]! Thanks for your comments! I addressed all of them so please check the latest patch again! Thanks! > Add tests to Localizer class > > > Key: SUBMARINE-67 > URL: https://issues.apache.org/jira/browse/SUBMARINE-67 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-67.001.patch, SUBMARINE-67.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-79) Cleanup MockRemoteDirectoryManager
Szilard Nemeth created SUBMARINE-79: --- Summary: Cleanup MockRemoteDirectoryManager Key: SUBMARINE-79 URL: https://issues.apache.org/jira/browse/SUBMARINE-79 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth Assignee: Szilard Nemeth Many methods in MockRemoteDirectoryManager declare an unnecessary throws clause for IOException. Also, there are 2 dangerous NPE candidates: 1. MockRemoteDirectoryManager#getJobStagingArea: {code:java} this.jobDir = new File(jobsParentDir.getAbsolutePath(), jobName); {code} If jobsParentDir is null, this line throws an NPE. 2. MockRemoteDirectoryManager#getModelDir: {code:java} File modelDir = new File(modelParentDir.getAbsolutePath(), modelName); {code} If modelParentDir is null, this line throws an NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
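One possible way to guard the two dereferences mentioned in SUBMARINE-79 is to check the parent directory before calling getAbsolutePath() and fail with a descriptive message. The sketch below is only an assumption about how such a guard might look: NullSafeDirsSketch and childDir are hypothetical names, and the real fix in MockRemoteDirectoryManager may well take a different shape (for example, initializing the parent directories in the constructor).

```java
import java.io.File;

/** Hypothetical sketch of guarding a parent directory before use. */
final class NullSafeDirsSketch {
  private NullSafeDirsSketch() {
  }

  /**
   * Resolves a child directory under parentDir. fieldName is only used to
   * produce a readable error message when parentDir was never initialized.
   */
  static File childDir(File parentDir, String fieldName, String childName) {
    if (parentDir == null) {
      throw new IllegalStateException(
          fieldName + " is not initialized, cannot resolve " + childName);
    }
    if (childName == null || childName.isEmpty()) {
      throw new IllegalArgumentException("childName must not be empty");
    }
    return new File(parentDir.getAbsolutePath(), childName);
  }

  public static void main(String[] args) {
    File jobsParentDir = new File(System.getProperty("java.io.tmpdir"), "jobs");
    System.out.println(childDir(jobsParentDir, "jobsParentDir", "job-1"));
  }
}
```

The same helper would cover both the job staging area and the model directory, since the two call sites differ only in which parent field they read.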
[jira] [Created] (SUBMARINE-78) Make RemoteDirectoryManager interface more consistent
Szilard Nemeth created SUBMARINE-78: --- Summary: Make RemoteDirectoryManager interface more consistent Key: SUBMARINE-78 URL: https://issues.apache.org/jira/browse/SUBMARINE-78 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth Assignee: Szilard Nemeth RemoteDirectoryManager contains many methods that receive a URI. These parameters are sometimes typed as String and sometimes as Path. We need to make this more consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-77) Create a builder class for NS Component
Szilard Nemeth created SUBMARINE-77: --- Summary: Create a builder class for NS Component Key: SUBMARINE-77 URL: https://issues.apache.org/jira/browse/SUBMARINE-77 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth In TensorFlowPsComponent#createComponent, we create a component object. According to [~adam.antal]'s comment given on one of my resolved jiras, this method shouts for a builder that builds the Component object. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
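To make the builder idea from SUBMARINE-77 concrete, here is a minimal sketch. The Component class below is a simplified, hypothetical stand-in and not the YARN service Component that TensorFlowPsComponent#createComponent actually populates; the field names (name, numberOfContainers, launchCommand, env) are assumptions picked for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a builder for a component-like object. */
final class ComponentBuilderSketch {
  private ComponentBuilderSketch() {
  }

  /** Simplified stand-in for a service component; immutable once built. */
  static final class Component {
    final String name;
    final long numberOfContainers;
    final String launchCommand;
    final Map<String, String> env;

    private Component(Builder b) {
      this.name = b.name;
      this.numberOfContainers = b.numberOfContainers;
      this.launchCommand = b.launchCommand;
      this.env = Collections.unmodifiableMap(new LinkedHashMap<>(b.env));
    }
  }

  /** Collects the settings step by step and validates them in build(). */
  static final class Builder {
    private String name;
    private long numberOfContainers = 1;
    private String launchCommand;
    private final Map<String, String> env = new LinkedHashMap<>();

    Builder name(String value) {
      this.name = value;
      return this;
    }

    Builder numberOfContainers(long value) {
      this.numberOfContainers = value;
      return this;
    }

    Builder launchCommand(String value) {
      this.launchCommand = value;
      return this;
    }

    Builder env(String key, String value) {
      this.env.put(key, value);
      return this;
    }

    Component build() {
      if (name == null || name.isEmpty()) {
        throw new IllegalStateException("name is required");
      }
      if (launchCommand == null || launchCommand.isEmpty()) {
        throw new IllegalStateException("launchCommand is required");
      }
      if (numberOfContainers < 1) {
        throw new IllegalStateException("numberOfContainers must be >= 1");
      }
      return new Component(this);
    }
  }

  public static void main(String[] args) {
    Component ps = new Builder().name("ps").numberOfContainers(2)
        .launchCommand("python ps.py").env("EXAMPLE_KEY", "value").build();
    System.out.println(ps.name + " x" + ps.numberOfContainers);
  }
}
```

Compared with setting fields one by one inside createComponent, a builder keeps the validation in a single place and makes each call site read as a list of the settings it cares about.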
[jira] [Commented] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840439#comment-16840439 ] Szilard Nemeth commented on SUBMARINE-49: - Hi [~pbacsko]! Thanks for your valuable comments! 1. Fixed, using Lists.newArrayList() for every possible occasion. 2. Changed this one to be public. 3. Fixed. 4. These classes are final because Checkstyle has a validation like "Utility class should be final". 5. Nope, it was package-private because this field is being accessed from the subclasses only. Changed it to be protected. 6. Fixed the javadoc of parseInputPath. 7. Yes, they need to be public as the tests are in a different package than the RunJobParameters class. 8. Thanks, this one makes sense as well! 9. Removed the TODO. 10. I would not bother with these license text differences if you don't mind, as my IDE generated them; the same goes for previous patches on Submarine. > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-60: Attachment: SUBMARINE-60.002.patch > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch, > SUBMARINE-60.002.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838346#comment-16838346 ] Szilard Nemeth commented on SUBMARINE-49: - Hi [~tangzhankun], [~sunilg]! Can you please check the latest patch? Adam most likely won't have time to review this today / tomorrow. > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-59) ArtifactId of Submarine yarnservice runtime is wrong
[ https://issues.apache.org/jira/browse/SUBMARINE-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-59: Attachment: SUBMARINE-59.002.patch > ArtifactId of Submarine yarnservice runtime is wrong > > > Key: SUBMARINE-59 > URL: https://issues.apache.org/jira/browse/SUBMARINE-59 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-59.001.patch, SUBMARINE-59.002.patch > > > Currently, the artifactId is "hadoop-submarine-score-yarnservice-runtime" > whereas it should be "hadoop-submarine-core-yarnservice-runtime" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-59) ArtifactId of Submarine yarnservice runtime is wrong
[ https://issues.apache.org/jira/browse/SUBMARINE-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838345#comment-16838345 ] Szilard Nemeth commented on SUBMARINE-59: - Hi [~tangzhankun]! Updated the patch with the suggested name, please check! > ArtifactId of Submarine yarnservice runtime is wrong > > > Key: SUBMARINE-59 > URL: https://issues.apache.org/jira/browse/SUBMARINE-59 > Project: Hadoop Submarine > Issue Type: Bug >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-59.001.patch, SUBMARINE-59.002.patch > > > Currently, the artifactId is "hadoop-submarine-score-yarnservice-runtime" > whereas it should be "hadoop-submarine-core-yarnservice-runtime" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838334#comment-16838334 ] Szilard Nemeth commented on SUBMARINE-56: - Thanks [~sunilg]! The hadolint error is not valid: it complains about the first line, but that line is the ASF license text, so I think we should not bother with it. > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch, SUBMARINE-56.004.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-49: Attachment: SUBMARINE-49.005.patch > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch, > SUBMARINE-49.005.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838323#comment-16838323 ] Szilard Nemeth commented on SUBMARINE-49: - Hi [~adam.antal]! Thanks for the review! In general, the patch changed a lot between versions 003 and 004: because the PyTorch integration (SUBMARINE-52) was merged to trunk, I needed to rebase this patch on top of trunk, and there were many conflicts. Replying to your comments: 1. Agreed with your point on the need to define some testcases for RoleResourceParser. Please check the testcases I added with patch005. 2. As I needed to rebase this patch on top of trunk, RunJobParameters$validateWorkersAndPs is not a method anymore. 3. Added a javadoc to method RunJobParameters#setDefaultDirs. 4. Fixed the wildcard imports. Also fixed the checkstyle issues resulting from patch004 (those that made sense). > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch, SUBMARINE-49.004.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837830#comment-16837830 ] Szilard Nemeth commented on SUBMARINE-60: - Hi [~sunilg]! Yes, it was related. Fixed it with patch002. Please take a look! Thanks! > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
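The refactoring proposed in the SUBMARINE-60 description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: `Config` and `ServiceClient` are hypothetical stand-ins for Hadoop's Configuration and AppAdminClient so that the example runs without Hadoop on the classpath, and `ServiceClientFactory`, `DefaultServiceClientFactory` and `JobSubmitter` are made-up names.

```java
import java.util.HashMap;
import java.util.Map;

class FactoryInjectionSketch {

  /** Hypothetical stand-in for Hadoop's Configuration. */
  static class Config {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
      values.put(key, value);
    }

    String get(String key, String defaultValue) {
      return values.getOrDefault(key, defaultValue);
    }
  }

  /** Hypothetical stand-in for AppAdminClient. */
  interface ServiceClient {
    String submit(String serviceName);
  }

  /** Receives the current configuration and returns a client. */
  interface ServiceClientFactory {
    ServiceClient create(Config conf);
  }

  /** Production factory: no test-only field or setter is needed here. */
  static class DefaultServiceClientFactory implements ServiceClientFactory {
    @Override
    public ServiceClient create(Config conf) {
      String cluster = conf.get("cluster.name", "default");
      return serviceName -> "submitted " + serviceName + " to " + cluster;
    }
  }

  /** Plays the role of a submitter; the factory arrives via the constructor. */
  static class JobSubmitter {
    private final Config conf;
    private final ServiceClientFactory factory;

    JobSubmitter(Config conf, ServiceClientFactory factory) {
      this.conf = conf;
      this.factory = factory;
    }

    String submitJob(String serviceName) {
      return factory.create(conf).submit(serviceName);
    }
  }

  public static void main(String[] args) {
    Config conf = new Config();
    conf.set("cluster.name", "test-cluster");

    // Production wiring uses the real factory.
    JobSubmitter production =
        new JobSubmitter(conf, new DefaultServiceClientFactory());
    System.out.println(production.submitJob("pytorch-job"));
    // prints "submitted pytorch-job to test-cluster"

    // A test injects a stub factory instead of setting a static field.
    JobSubmitter underTest =
        new JobSubmitter(conf, c -> name -> "stubbed " + name);
    System.out.println(underTest.submitJob("pytorch-job"));
    // prints "stubbed pytorch-job"
  }
}
```

With this shape, the production class carries no VisibleForTesting setter, and a test decides which client is used purely through what it passes to the constructor.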
[jira] [Updated] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-60: Attachment: SUBMARINE-60.002.patch > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-60.001.patch, SUBMARINE-60.002.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-49) Add more test coverage to RunJobParameters
[ https://issues.apache.org/jira/browse/SUBMARINE-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837494#comment-16837494 ] Szilard Nemeth commented on SUBMARINE-49: - As SUBMARINE-52 was merged, I uploaded a rebased patch. [~adam.antal]: Will address your comments soon! > Add more test coverage to RunJobParameters > -- > > Key: SUBMARINE-49 > URL: https://issues.apache.org/jira/browse/SUBMARINE-49 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-49.001.patch, SUBMARINE-49.002.patch, > SUBMARINE-49.003.patch, SUBMARINE-49.004.patch > > > There are some good tests in > {{org.apache.hadoop.yarn.submarine.client.cli.TestRunJobCliParsing}}, but > these are not testing all fields set by method > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}}. > > Some more extensive testing is needed in this area. > As an added bonus, the code > {{org.apache.hadoop.yarn.submarine.client.cli.param.RunJobParameters#updateParametersByParsedCommandline}} > could be cleaned up a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837336#comment-16837336 ] Szilard Nemeth commented on SUBMARINE-52: - thanks [~sunilg] > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch, > SUBMARINE-52.008.patch, SUBMARINE-52.009.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836972#comment-16836972 ] Szilard Nemeth edited comment on SUBMARINE-52 at 5/10/19 10:11 AM: --- We have the well known build issue again.. [~sunilg]: Is there a way I can troubleshoot this problem? At least finding the root cause would be beneficial as I need to spend extra cycles every time this comes up. [~tangzhankun]: Apart from this, patch009 is ready to review! was (Author: snemeth): We have the well known build issue again.. [~sunilg]: Is there a way I can troubleshoot this problem? At least finding the root cause would be beneficial as I need to spend extra cycles every time this comes up. > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch, > SUBMARINE-52.008.patch, SUBMARINE-52.009.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836972#comment-16836972 ] Szilard Nemeth commented on SUBMARINE-52: - We have the well known build issue again.. [~sunilg]: Is there a way I can troubleshoot this problem? At least finding the root cause would be beneficial as I need to spend extra cycles every time this comes up. > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch, > SUBMARINE-52.008.patch, SUBMARINE-52.009.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-69) Add tests to ZipUtilities class
Szilard Nemeth created SUBMARINE-69: --- Summary: Add tests to ZipUtilities class Key: SUBMARINE-69 URL: https://issues.apache.org/jira/browse/SUBMARINE-69 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth Assignee: Szilard Nemeth -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-68) Add tests to FileSystemOperations class
Szilard Nemeth created SUBMARINE-68: --- Summary: Add tests to FileSystemOperations class Key: SUBMARINE-68 URL: https://issues.apache.org/jira/browse/SUBMARINE-68 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth Assignee: Szilard Nemeth -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-60: Attachment: SUBMARINE-60.001.patch > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-60.001.patch > > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-60) Remove stubServiceClient from YarnServiceUtils
[ https://issues.apache.org/jira/browse/SUBMARINE-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-60: Description: We have a field in YarnServiceUtils: org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient There's a setter for this field, marked with VisibleForTesting and the test code uses this setter to set the value of this field to a mock. Then, when createServiceClient gets called from the production code, it first checks if we have this field set. If so, we return it, otherwise we create a normal app admin client. This is an anti-pattern to just have test-related fields or methods in the production code. Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only users of org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. This static could be easily replaced with a factory that receives the current Configuration object then returns the AppAdminClient. The test should inject the mock factory either via the constructor or with a setter. was: We have a field in YarnServiceUtils: org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient There's a setter for this field, marked with VisibleForTesting and the test code uses this setter to set the value of this field to a mock. Then, when createServiceClient gets called from the production code, it first checks if we have this field set. If so, we return it, otherwise we create a normal app admin client. This is an anti-pattern to just have test-related fields or methods in the production code. Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only users of org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. This static could be easily replaced with a factory that receives the current Configuration object then returns the AppAdminClient. 
The test should inject the mock factory either via the constructor or with a setter (constructor is preferred). > Remove stubServiceClient from YarnServiceUtils > -- > > Key: SUBMARINE-60 > URL: https://issues.apache.org/jira/browse/SUBMARINE-60 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > > We have a field in YarnServiceUtils: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#stubServiceClient > There's a setter for this field, marked with VisibleForTesting and the test > code uses this setter to set the value of this field to a mock. > Then, when createServiceClient gets called from the production code, it first > checks if we have this field set. If so, we return it, otherwise we create a > normal app admin client. > This is an anti-pattern to just have test-related fields or methods in the > production code. > > Currently, YarnServiceJobSubmitter and YarnServiceJobMonitor are the only > users of > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils#createServiceClient. > This static could be easily replaced with a factory that receives the current > Configuration object then returns the AppAdminClient. The test should inject > the mock factory either via the constructor or with a setter. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-56: Attachment: SUBMARINE-56.004.patch > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch, SUBMARINE-56.004.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836328#comment-16836328 ] Szilard Nemeth commented on SUBMARINE-56: - Hi [~sunilg]! Thanks for the review! Added a reference to PyTorch support on the Index.md page. 1. Fixed the DeveloperGuide.md sentence you highlighted. 2. InstallationGuide.md: Fixed the WriteDockerfileTF.md reference with the html. 3. RunningSingleNodeCifar10PTJobs.md: Sorry, I was wrong about the checkpoint path being mandatory; it is only mandatory if we are using TF with Tensorboard. 4. Fixed the reference to the Docker image. Are you fine with the latest patch? > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch, SUBMARINE-56.004.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836307#comment-16836307 ] Szilard Nemeth commented on SUBMARINE-66: - patch002 fixes the unit test failure. > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-66.001.patch, SUBMARINE-66.002.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
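The idea in the SUBMARINE-66 description above, building the TF_CONFIG value as data and letting a serializer produce the string, might look roughly like this. It is a sketch under assumptions rather than the actual patch: the `cluster`/`task` layout follows the usual TF_CONFIG shape but is not copied from TensorFlowCommons#getTFConfigEnv, the host names are invented, and the small `toJson` helper is only a dependency-free stand-in for a real JSON library (it escapes quotes and backslashes but not control characters).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TfConfigSketch {

  /** Builds the TF_CONFIG content as nested maps and lists, not as a String. */
  static Map<String, Object> buildTfConfig(List<String> workers,
      List<String> parameterServers, String taskType, int taskIndex) {
    Map<String, Object> cluster = new LinkedHashMap<>();
    cluster.put("worker", workers);
    cluster.put("ps", parameterServers);

    Map<String, Object> task = new LinkedHashMap<>();
    task.put("type", taskType);
    task.put("index", taskIndex);

    Map<String, Object> root = new LinkedHashMap<>();
    root.put("cluster", cluster);
    root.put("task", task);
    return root;
  }

  /**
   * Minimal stand-in serializer for maps, lists, strings and other scalars.
   * A real implementation would delegate to a JSON library instead.
   */
  static String toJson(Object value) {
    if (value instanceof Map) {
      return ((Map<?, ?>) value).entrySet().stream()
          .map(e -> toJson(String.valueOf(e.getKey())) + ":"
              + toJson(e.getValue()))
          .collect(Collectors.joining(",", "{", "}"));
    }
    if (value instanceof List) {
      return ((List<?>) value).stream()
          .map(TfConfigSketch::toJson)
          .collect(Collectors.joining(",", "[", "]"));
    }
    if (value instanceof String) {
      String s = (String) value;
      return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
    // Numbers, booleans and null are written as-is.
    return String.valueOf(value);
  }

  public static void main(String[] args) {
    Map<String, Object> config = buildTfConfig(
        Arrays.asList("w0:8000", "w1:8000"),
        Arrays.asList("ps0:8000"),
        "worker", 0);
    System.out.println(toJson(config));
    // prints {"cluster":{"worker":["w0:8000","w1:8000"],"ps":["ps0:8000"]},"task":{"type":"worker","index":0}}
  }
}
```

Because the structure is built separately from the string, a test can compare the output against an expected JSON file (or against the map itself) instead of a long hand-written string literal.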
[jira] [Updated] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-66: Attachment: SUBMARINE-66.002.patch > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-66.001.patch, SUBMARINE-66.002.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-66: Status: Patch Available (was: In Progress) > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-66.001.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-66: Attachment: SUBMARINE-66.001.patch > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-66.001.patch > > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-66) Improve TF config env JSON generator + tests
[ https://issues.apache.org/jira/browse/SUBMARINE-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-66: Summary: Improve TF config env JSON generator + tests (was: Improve TF config env JSON generator) > Improve TF config env JSON generator + tests > > > Key: SUBMARINE-66 > URL: https://issues.apache.org/jira/browse/SUBMARINE-66 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv > generates a JSON of the TF_CONFIG environment variable. > This code could be improved to use a JSON serializer instead of hand-crafting > JSON data. > The test class of this class also could be improved: > org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator > In this class, there are some quite unreadable JSON strings, this could be > refactored to be read out from a file instead. > TestTFConfigGenerator has only one testcase and it's ugly, as we use very > long strings to verify the JSON data produced matches our expected JSON. > We should use JSON files instead of manually constructing Strings in tests, > especially if they are very long. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835501#comment-16835501 ] Szilard Nemeth commented on SUBMARINE-56: - [~sunilg]: One thing I noticed: It seems the TonY runtime document is not referenced from the main page. [~oliverhuh...@gmail.com]: How should I include a link to the QuickStart.md from the main page? Since the site MD files are in hadoop-submarine-core and TonY's quick start is in module hadoop-submarine-tony-runtime, I don't see how we are supposed to do that. Thanks! > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835495#comment-16835495 ] Szilard Nemeth commented on SUBMARINE-56: - Patch003 contains some fixes for links and replaced some lowercase "submarine" strings with "Submarine". I also did a rebase to trunk as there's no dependency on SUBMARINE-52. [~sunilg]: I generated the documentation (site) with the following commands:
{code:java}
mvn clean install -Pdist,submarine -DskipTests -DskipShade -Dmaven.javadoc.skip=true
mvn site:site -Psubmarine
{code}
Then verified the documentation links. Looks good. However, there are some sentences where grammar could be improved. Should I file a separate jira for that? Thanks! > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-56: Attachment: SUBMARINE-56.003.patch > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch, > SUBMARINE-56.003.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835371#comment-16835371 ] Szilard Nemeth commented on SUBMARINE-52: - Hi [~oliverhuh...@gmail.com]! Thanks for pointing me to the documentation of the TonY runtime. Are you okay with me filing a follow-up jira for TonY + PyTorch and leaving this patch as it is now? I don't want to increase its size any further. Thanks! > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch, > SUBMARINE-52.008.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835348#comment-16835348 ] Szilard Nemeth commented on SUBMARINE-56: - Hi [~adam.antal]! Thanks for your valuable comments! 1. Updated the InstallationGuide.md with a link. This was a good point! I haven't touched the Chinese version. [~tangzhankun]: Could you please help me with that? 2. Updated the "Please note that" section as well. 3. Added a link to the Quickstart page. That page covers all the details for CLI arguments. [~adam.antal]: Please review the latest patch! Thanks! > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-56: Attachment: SUBMARINE-56.002.patch > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch, SUBMARINE-56.002.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-67) Add tests to Localizer class
[ https://issues.apache.org/jira/browse/SUBMARINE-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-67: Status: Patch Available (was: In Progress) > Add tests to Localizer class > > > Key: SUBMARINE-67 > URL: https://issues.apache.org/jira/browse/SUBMARINE-67 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-67.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-67) Add tests to Localizer class
[ https://issues.apache.org/jira/browse/SUBMARINE-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-67: Attachment: SUBMARINE-67.001.patch > Add tests to Localizer class > > > Key: SUBMARINE-67 > URL: https://issues.apache.org/jira/browse/SUBMARINE-67 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: SUBMARINE-67.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (SUBMARINE-67) Add tests to Localizer class
[ https://issues.apache.org/jira/browse/SUBMARINE-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned SUBMARINE-67: --- Assignee: Szilard Nemeth > Add tests to Localizer class > > > Key: SUBMARINE-67 > URL: https://issues.apache.org/jira/browse/SUBMARINE-67 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-67) Add tests to Localizer class
Szilard Nemeth created SUBMARINE-67: --- Summary: Add tests to Localizer class Key: SUBMARINE-67 URL: https://issues.apache.org/jira/browse/SUBMARINE-67 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-67) Add tests to Localizer class
[ https://issues.apache.org/jira/browse/SUBMARINE-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-67: Priority: Minor (was: Major) > Add tests to Localizer class > > > Key: SUBMARINE-67 > URL: https://issues.apache.org/jira/browse/SUBMARINE-67 > Project: Hadoop Submarine > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-56: Status: Patch Available (was: In Progress) > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-56) Update documentation to describe single-node PyTorch integration
[ https://issues.apache.org/jira/browse/SUBMARINE-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-56: Attachment: SUBMARINE-56.001.patch > Update documentation to describe single-node PyTorch integration > > > Key: SUBMARINE-56 > URL: https://issues.apache.org/jira/browse/SUBMARINE-56 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-56.001.patch > > > We should include trying out and experimenting with PyTorch on a real cluster > to document it properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831428#comment-16831428 ] Szilard Nemeth edited comment on SUBMARINE-52 at 5/2/19 6:31 AM: - Hi [~sunilg]! 1. Thanks, I hope this will be resolved soon. If you check the latest build result from Jenkins, you can see we are still encountering this issue quite often. 2. I tried to run ./start-build-env.sh in Hadoop's root, but it hangs at the docker run command that starts the hadoop-build container. I had some successful docker builds before. Have you ever seen this command hang like that? 3, 4, 5, 6: I think these are no-ops for this patch at this point in time. 7. I already fixed the majority of the checkstyle issues; the remaining ones are those that I didn't plan to fix, as described above. As discussed with [~tangzhankun]: If this goes in, I will rebase my SUBMARINE-49 patch on top of it and could go forward with that. In the meantime, I can make some progress with the documentation. [~sunilg], [~wangda]: Ideally, this patch should be reviewed & committed this week, right? Please note that I have a day off tomorrow so the earliest time I can fix review comments will be on Monday, next week. Thanks! was (Author: snemeth): Hi [~sunilg]! 1. Thanks, I hope this will be resolved soon. If you check the latest build result from Jenkins, you can see we are still encountering this issue quite often. 2. I tried to run ./start-build-env.sh in Hadoop's root, but it hangs at the docker run command that starts the hadoop-build container. I had some successful docker builds before. Have you ever seen this command hang like that? 3, 4, 5, 6: I think these are no-ops for this patch at this point in time. 7. I already fixed the majority of the checkstyle issues; the remaining ones are those that I didn't plan to fix, as described above. 
As discussed with [~tangzhankun]: If this goes in, I will rebase my SUBMARINE-49 patch on top of it and could go forward with that. In the meantime, I can make some progress with the documentation. [~sunilg], [~wangda]: Ideally, this patch should be reviewed & committed this week, right? Please note that I have a day off tomorrow so the earliest time I can fix review comments will be Monday, next week. Thanks! > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch, > SUBMARINE-52.008.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830582#comment-16830582 ] Szilard Nemeth commented on SUBMARINE-52: - Thank you very much, [~adam.antal]! > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830581#comment-16830581 ] Szilard Nemeth commented on SUBMARINE-52: - Thank you very much, [~adam.antal]! > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-66) Improve TF config env JSON generator
Szilard Nemeth created SUBMARINE-66: --- Summary: Improve TF config env JSON generator Key: SUBMARINE-66 URL: https://issues.apache.org/jira/browse/SUBMARINE-66 Project: Hadoop Submarine Issue Type: Improvement Reporter: Szilard Nemeth org.apache.hadoop.yarn.submarine.runtimes.yarnservice.tensorflow.TensorFlowCommons#getTFConfigEnv generates the JSON value of the TF_CONFIG environment variable. This code could be improved to use a JSON serializer instead of hand-crafting JSON data. The test class for this code could also be improved: org.apache.hadoop.yarn.submarine.runtimes.yarnservice.TestTFConfigGenerator. It contains some hard-to-read JSON strings, which could be moved to a file and read from there instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
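To illustrate the idea in SUBMARINE-66, here is a minimal sketch of building the TF_CONFIG value as nested maps and lists and serializing it in one place, rather than concatenating strings. Everything in it is an assumption made for illustration: the class TfConfigSketch, the method names, the host names and the simplified "cluster" / "task" shape are hypothetical and are not taken from TensorFlowCommons. The small toJson helper is only a stand-in so that the sketch runs without extra dependencies; in the real code a JSON library such as Jackson would presumably take its place.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Submarine implementation.
class TfConfigSketch {

  // Builds a simplified TF_CONFIG structure as data instead of as a string.
  static Map<String, Object> buildTfConfig(List<String> workers,
      List<String> parameterServers, String taskType, int taskIndex) {
    Map<String, Object> cluster = new LinkedHashMap<>();
    cluster.put("worker", workers);
    cluster.put("ps", parameterServers);

    Map<String, Object> task = new LinkedHashMap<>();
    task.put("type", taskType);
    task.put("index", taskIndex);

    Map<String, Object> config = new LinkedHashMap<>();
    config.put("cluster", cluster);
    config.put("task", task);
    return config;
  }

  // Stand-in serializer: handles maps, lists, strings, integers, booleans, null.
  static String toJson(Object value) {
    if (value == null) {
      return "null";
    }
    if (value instanceof Map) {
      StringBuilder sb = new StringBuilder("{");
      boolean first = true;
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        if (!first) {
          sb.append(',');
        }
        first = false;
        sb.append(quote(String.valueOf(e.getKey())));
        sb.append(':').append(toJson(e.getValue()));
      }
      return sb.append('}').toString();
    }
    if (value instanceof List) {
      StringBuilder sb = new StringBuilder("[");
      boolean first = true;
      for (Object item : (List<?>) value) {
        if (!first) {
          sb.append(',');
        }
        first = false;
        sb.append(toJson(item));
      }
      return sb.append(']').toString();
    }
    if (value instanceof String) {
      return quote((String) value);
    }
    if (value instanceof Integer || value instanceof Long
        || value instanceof Boolean) {
      return value.toString();
    }
    throw new IllegalArgumentException("Unsupported type: " + value.getClass());
  }

  // Escapes quotes, backslashes and control characters in one central place.
  static String quote(String s) {
    StringBuilder sb = new StringBuilder("\"");
    for (char c : s.toCharArray()) {
      if (c == '"' || c == '\\') {
        sb.append('\\').append(c);
      } else if (c < 0x20) {
        sb.append(String.format("\\u%04x", (int) c));
      } else {
        sb.append(c);
      }
    }
    return sb.append('"').toString();
  }

  public static void main(String[] args) {
    Map<String, Object> config = buildTfConfig(
        Arrays.asList("worker-0.example:8000", "worker-1.example:8000"),
        Arrays.asList("ps-0.example:8000"), "worker", 0);
    System.out.println(toJson(config));
  }
}
```

With this structure, tests could compare parsed structures or expected files instead of long inline JSON strings, which is the readability problem the issue describes for TestTFConfigGenerator.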
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830317#comment-16830317 ] Szilard Nemeth commented on SUBMARINE-52: - Latest patch (007) fixes unused imports. > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-52: Attachment: SUBMARINE-52.007.patch > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch, SUBMARINE-52.007.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830239#comment-16830239 ] Szilard Nemeth commented on SUBMARINE-52: - Patch006 contains some modifications we agreed on offline with [~tangzhankun]. [~tangzhankun]: Please review! Thanks! > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-52: Attachment: SUBMARINE-52.006.patch > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch, SUBMARINE-52.006.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830040#comment-16830040 ] Szilard Nemeth commented on SUBMARINE-52: - Patch005 fixes the checkstyle issues. > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch, > SUBMARINE-52.005.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SUBMARINE-65) Make PyTorch support HDFS
Szilard Nemeth created SUBMARINE-65: --- Summary: Make PyTorch support HDFS Key: SUBMARINE-65 URL: https://issues.apache.org/jira/browse/SUBMARINE-65 Project: Hadoop Submarine Issue Type: Sub-task Reporter: Szilard Nemeth Unlike TensorFlow, PyTorch does not support HDFS as a data source. I found this related issue, but it contains nothing meaningful: https://github.com/pytorch/pytorch/issues/5867 I think we should make PyTorch support HDFS by contributing to PyTorch's source code. For reference, here is the TF implementation of the HDFS connector: https://github.com/tensorflow/tensorflow/tree/17e49b339b2b9a58ed967c69b7acb714dcd9b465/tensorflow/core/platform/hadoop Any other ideas for approaching this problem are welcome! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SUBMARINE-52) Generate Service spec + launch script for single-node PyTorch learning job
[ https://issues.apache.org/jira/browse/SUBMARINE-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated SUBMARINE-52: Attachment: SUBMARINE-52.004.patch > Generate Service spec + launch script for single-node PyTorch learning job > -- > > Key: SUBMARINE-52 > URL: https://issues.apache.org/jira/browse/SUBMARINE-52 > Project: Hadoop Submarine > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: SUBMARINE-52-2.001.patch, SUBMARINE-52.001.patch, > SUBMARINE-52.002.patch, SUBMARINE-52.003.patch, SUBMARINE-52.004.patch > > > Similar to what we have for Tensorflow in > {{org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter}}, > we need a code that generates Service spec file (json) for PyTorch. > We also need to take care of the separation of CLI/YAML arguments of TF / > PyTorch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)