[ https://issues.apache.org/jira/browse/SLIDER-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gour Saha updated SLIDER-1187:
------------------------------
    Fix Version/s:     (was: Slider 1.0.0)
                       Slider 0.92

> Create app diagnostics resource with placeholder for containers (live/dead)
> ---------------------------------------------------------------------------
>
>                 Key: SLIDER-1187
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1187
>             Project: Slider
>          Issue Type: Sub-task
>          Components: appmaster, client
>    Affects Versions: Slider 0.91
>            Reporter: Gour Saha
>            Assignee: Gour Saha
>             Fix For: Slider 0.92
>
>         Attachments: SLIDER-1187.001.patch, SLIDER-1187.002.patch,
>                      SLIDER-1187.003.patch, SLIDER-1187.004.patch
>
>
> This is a sample JSON structure of the proposed diagnostics resource -
> {code}
> {
>   "finalStatus": "SUCCEEDED",
>   "finalMessage": "stop command issued",
>   "containers": [
>     {
>       "containerId": "container_e3374_1485226679409_0016_01_000004",
>       "component": "COMMAND_LOGGER",
>       "appVersion": "1.0.0",
>       "state": 3,
>       "exitCode": -1000,
>       "diagnostics": "",
>       "createTime": 1485285533968,
>       "startTime": 1485285533989,
>       "host": "cn008.l42scl.hortonworks.com",
>       "hostURL": "http://cn008.l42scl.hortonworks.com:8042",
>       "logLink": "http://cn007.l42scl.hortonworks.com:19888/jobhistory/logs/cn008.l42scl.hortonworks.com:45454/container_e3374_1485226679409_0016_01_000004/ctx/root"
>     },
>     {
>       "containerId": "container_e3374_1485226679409_0016_01_000003",
>       "component": "COMMAND_LOGGER",
>       "appVersion": "1.0.0",
>       "state": 3,
>       "exitCode": -1000,
>       "diagnostics": "",
>       "createTime": 1485285120456,
>       "startTime": 1485285120723,
>       "host": "cn005.l42scl.hortonworks.com",
>       "hostURL": "http://cn005.l42scl.hortonworks.com:8042",
>       "logLink": "http://cn007.l42scl.hortonworks.com:19888/jobhistory/logs/cn005.l42scl.hortonworks.com:45454/container_e3374_1485226679409_0016_01_000003/ctx/root"
>     },
>     {
>       "containerId": "container_e3374_1485226679409_0016_01_000002",
>       "component": "COMMAND_LOGGER",
>       "appVersion": "1.0.0",
>       "state": 4,
>       "exitCode": -100,
>       "diagnostics": "Container released by application",
>       "createTime": 1485285120464,
>       "startTime": 1485285120522,
>       "host": "cn008.l42scl.hortonworks.com",
>       "hostURL": "http://cn008.l42scl.hortonworks.com:8042",
>       "logLink": "http://cn007.l42scl.hortonworks.com:19888/jobhistory/logs/cn008.l42scl.hortonworks.com:45454/container_e3374_1485226679409_0016_01_000002/ctx/root"
>     }
>   ]
> }
> {code}
> API consumers will need to call the _*SliderClient#actionDiagnosticContainers*_
> API to get the _*ApplicationDiagnostics*_ object. This object has 3
> attributes -
> # *finalStatus* - app-level status, which is empty for a running app (of type
> _org.apache.hadoop.yarn.api.records.FinalApplicationStatus_)
> # *finalMessage* - app-level summary message, which is populated after the app
> dies
> # *containers* - a set of all currently running and all previously failed
> containers (of type _org.apache.slider.api.types.ContainerInformation_)
> Note, the object also provides a helper method _getContainer(String
> containerId)_ which returns the _ContainerInformation_ for a specific
> container when the container id is known.
> _*ContainerInformation*_ (for each running or dead container) contains
> several attributes which get updated as a container transitions
> through its various stages (newly created, running, dead, etc.). The
> attributes are -
> - containerId
> - component
> - appVersion
> - released (true/false)
> - state (of type org.apache.slider.api.StateValues)
> - exitCode (of type org.apache.hadoop.yarn.api.records.ContainerExitStatus)
> - diagnostics (container-level diagnostics message)
> - createTime
> - startTime
> - host
> - hostURL
> - placement
> - output (empty, so don't use)
> - logLink (container log link for a live as well as a dead container)
> h6. For an app which is still RUNNING -
> The _ApplicationDiagnostics_ object can be retrieved at any point in the app's
> lifetime by calling the
> _*SliderClient#actionDiagnosticContainers(ActionDiagnosticArgs
> diagnosticArgs)*_ API with only the name field in _ActionDiagnosticArgs_ set
> to the application name. It can be retrieved on the command line by calling
> the *diagnostics* command with the following arguments -
> {code}
> slider diagnostics --name <app-name> --containers
> {code}
> On the command line it is dumped in JSON format.
> h6. For an app which is FAILED/KILLED -
> The _ApplicationDiagnostics_ object is set as the YARN application diagnostics
> and can be retrieved through the YARN API or the *application* command line, like -
> {code}
> yarn application -status <application_id>
> {code}
> Note, the _ApplicationDiagnostics_ object (in JSON format) can also be viewed
> in the RM UI of the application, in the *Diagnostics:* field.
> With the YARN client API, this JSON string can be retrieved by
> calling _*YarnClient#getApplicationReport(ApplicationId appId)*_ to get the
> _ApplicationReport_ and then calling
> _*ApplicationReport#getDiagnostics*_. This JSON string can then be
> converted to the Slider _ApplicationDiagnostics_ object by calling the static
> method _*ApplicationDiagnostics#fromJson(String json)*_.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
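For consumers that work with the JSON dump rather than the Java object, the following is a minimal sketch of how the diagnostics document described above might be inspected. It assumes the JSON has already been captured as a string (for example, from the command-line dump or from the YARN diagnostics field). The function name `summarize` and the keys of its return value are hypothetical and not part of Slider. Only field names that appear in the sample JSON are read, and the numeric `state` values are grouped as-is without interpreting them.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample document from the issue description
# (only the fields this sketch reads are kept).
SAMPLE = """
{
  "finalStatus": "SUCCEEDED",
  "finalMessage": "stop command issued",
  "containers": [
    {"containerId": "container_e3374_1485226679409_0016_01_000004",
     "state": 3, "exitCode": -1000, "diagnostics": ""},
    {"containerId": "container_e3374_1485226679409_0016_01_000003",
     "state": 3, "exitCode": -1000, "diagnostics": ""},
    {"containerId": "container_e3374_1485226679409_0016_01_000002",
     "state": 4, "exitCode": -100,
     "diagnostics": "Container released by application"}
  ]
}
"""


def summarize(diagnostics_json):
    """Group container ids by raw state and collect non-empty diagnostics.

    Hypothetical helper: it is not part of the Slider or YARN APIs.
    """
    doc = json.loads(diagnostics_json)
    containers = doc.get("containers", [])

    # Map each numeric state value to the ids of containers in that state.
    by_state = defaultdict(list)
    for container in containers:
        by_state[container["state"]].append(container["containerId"])

    # Keep only containers that carry a container-level diagnostics message.
    messages = {
        container["containerId"]: container["diagnostics"]
        for container in containers
        if container.get("diagnostics")
    }

    return {
        "finalStatus": doc.get("finalStatus"),
        "finalMessage": doc.get("finalMessage"),
        "byState": dict(by_state),
        "messages": messages,
    }


if __name__ == "__main__":
    print(json.dumps(summarize(SAMPLE), indent=2))
```

For the sample document this groups two containers under state 3 and one under state 4, and reports a single diagnostics message for the container whose id ends in `000002`.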