[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172816#comment-16172816 ]

Eric Yang commented on YARN-7215:
---------------------------------

Slider list was not robust: it took multiple seconds to respond to a query. If several hundred users manage their own applications through a UI, retrieval should respond in the low-millisecond range. When application data is persisted in the same metastore with index and search capability, it is easier to use the same storage mechanism to build an application catalog.

It is easy to build a view of YARN-deployed applications by computing it from metadata stored on HDFS and ZooKeeper, but those services are not optimized for serving a web application REST API. Let's go one step further and account for the too-many-small-files problem on HDFS and oversized z-nodes on ZooKeeper in the design. This will help steer developers toward a good design pattern. I am open to suggestions for listing YARN applications in a way that survives a ResourceManager restart.

> REST API to list all deployed services by the same user
> -------------------------------------------------------
>
>                 Key: YARN-7215
>                 URL: https://issues.apache.org/jira/browse/YARN-7215
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, applications
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> In Slider, it is possible to list deployed applications from the same user by using:
> {code}
> slider list
> {code}
> This API can help a UI display the applications and services deployed by the same user.
> ApiServer does not have the ability to list all applications/services at this time. This API requires a fast response when listing all applications because it is a common UI operation. ApiServer-deployed applications persist their configuration in HDFS, similar to Slider, but using directory listings to display deployed applications might put too much overhead on the NameNode. We may want to use an alternative storage mechanism to cache deployed application configuration and speed up listing deployed applications.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172740#comment-16172740 ]

Jian He commented on YARN-7215:
-------------------------------

To clarify a bit more: the new YARN UI service tab already lists the services, and it does so by passing a yarn-service type filter to the RM. I meant to implement something similar in the CLI to list services. Users can already get the same result today by passing a "yarn-service" filter to the "yarn application -list" command. I think making it more explicit with a "yarn service list" command would be more convenient for the user.
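As a rough illustration of the type-filter approach described above, the sketch below builds a query against the ResourceManager REST endpoint `/ws/v1/cluster/apps` using its `applicationTypes` and `user` parameters, then extracts a few fields from the JSON response. The host name, user name, and helper function names are assumptions made for this example, and no network call is made; this is not the code used by the YARN UI or CLI.

```python
import json
from urllib.parse import urlencode


def services_url(rm_address, user=None):
    """Build an RM REST query that lists applications of type yarn-service.

    rm_address is a hypothetical "host:port" for the RM web endpoint.
    """
    params = {"applicationTypes": "yarn-service"}
    if user:
        params["user"] = user
    return f"http://{rm_address}/ws/v1/cluster/apps?{urlencode(params)}"


def parse_services(body):
    """Return (id, name, state) tuples from an RM apps JSON response."""
    data = json.loads(body)
    # The RM may return {"apps": null} when nothing matches the filter.
    apps = (data.get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["state"]) for a in apps]


if __name__ == "__main__":
    print(services_url("rm.example.com:8088", user="alice"))
    # Hand-written sample body for illustration only, not real cluster output.
    sample = json.dumps({"apps": {"app": [
        {"id": "application_1505900000000_0001", "name": "sleeper",
         "state": "RUNNING", "applicationType": "yarn-service"},
    ]}})
    print(parse_services(sample))
```

The CLI form mentioned in the comment would be along the lines of `yarn application -list -appTypes yarn-service`.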
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172724#comment-16172724 ]

Jian He commented on YARN-7215:
-------------------------------

I didn't mean to make the RM store the app configs. In fact, the RM has no way to even get the app configs; from the RM's point of view, yarn-service is just an app. The RM can only store YARN's own metadata.

Isn't this jira, by its description, meant to implement "slider list"? That is as simple as getting the list of apps with some meta status info, which the "yarn application -list" command already does today. I don't think we need a solr backend to support such a simple use case. That is also how "slider list" worked before. Users should be able to simply list services without solr in the picture, just like listing apps.

I guess you meant the bigger things in YARN-7129, such as indexing apps by configs with solr or something similar? If this jira is meant to implement the bigger things in YARN-7129, I can probably open a separate jira to implement a "yarn service list" command, which is a fairly simple patch.
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172617#comment-16172617 ]

Eric Yang commented on YARN-7215:
---------------------------------

Data stored in ZooKeeper cannot exceed 1 MB per node. A large-scale application can exceed that limit when hostnames and config key/value pairs are stored in the state or spec file. Application state may be fine, but I can't recommend using ZooKeeper as low-latency storage for application configuration. Ambari version 0.0 (HMS) implemented a similar use case and quickly hit the z-node size limitation.
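To make the size concern concrete, here is a minimal sketch that serializes a made-up application spec to JSON and compares its size with ZooKeeper's default `jute.maxbuffer` of 0xfffff bytes (just under 1 MB). The spec layout, host names, and helper names are hypothetical, and treating the limit as applying to the data payload alone is a simplification.

```python
import json

# ZooKeeper's default jute.maxbuffer: 0xfffff bytes, just under 1 MB.
ZNODE_LIMIT_BYTES = 0xFFFFF


def fits_in_znode(spec, limit=ZNODE_LIMIT_BYTES):
    """Return (fits, size_in_bytes) for a spec serialized as UTF-8 JSON."""
    payload = json.dumps(spec).encode("utf-8")
    return len(payload) <= limit, len(payload)


def make_spec(num_hosts, keys_per_host):
    """Build a hypothetical spec: per-host config key/value pairs."""
    return {
        f"host-{h:05d}.example.com": {
            f"config.key.{k}": "value" * 4 for k in range(keys_per_host)
        }
        for h in range(num_hosts)
    }


if __name__ == "__main__":
    for hosts in (10, 2000):
        ok, size = fits_in_znode(make_spec(hosts, 50))
        print(f"{hosts} hosts: {size} bytes, fits in one z-node: {ok}")
```

With these made-up numbers, a ten-host spec fits comfortably, while a spec for a couple of thousand hosts with 50 keys each does not.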
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172482#comment-16172482 ]

Jian He commented on YARN-7215:
-------------------------------

bq. How does RM handle a service that is in stopped state?

Actually, the RM today already remembers stopped apps in ZooKeeper, and it has its own way to look up applications. I'm not suggesting making the RM do any more reads/writes.

What is the scope of this jira? By the description, it looks like it only needs to support the old slider list. Slider was also looking this up from the RM; it wasn't reading from HDFS.
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172434#comment-16172434 ]

Eric Yang commented on YARN-7215:
---------------------------------

[~jianhe] How does RM handle a service that is in stopped state? A stopped Slider application does not have any record in the ResourceManager. The same Slider application can have multiple application IDs when it has been restarted. Slider uses an HDFS file to persist the paused application, but having the ResourceManager crawl through lists of HDFS directories to find stopped services seems like a potential load attack on the NameNode.

It would be better to have the operational records indexed and cached by a well-known mechanism such as a SOLR collection. This also avoids having to brew another random read/write, low-latency index and cache mechanism in YARN. Both HBase and SOLR have solved random read/write on top of HDFS with some success. It would be better to use existing libraries that have been baked for several years than to invent something new for a specialized purpose.
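The idea of an operational record index can be sketched with a small in-memory stand-in. This is not SOLR or HBase; it only illustrates the shape of the records being discussed: one entry per service and user that survives stop/start cycles, accumulates application IDs across restarts, and can be listed per user without crawling HDFS directories. All class, field, and method names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """One operational record per (user, service name)."""
    name: str
    user: str
    state: str
    app_ids: list = field(default_factory=list)


class ServiceIndex:
    """In-memory stand-in for an indexed store of service records."""

    def __init__(self):
        self._by_user = defaultdict(dict)

    def record_start(self, user, name, app_id):
        """Mark a service as running and remember its new application ID."""
        rec = self._by_user[user].get(name)
        if rec is None:
            rec = ServiceRecord(name=name, user=user, state="RUNNING")
            self._by_user[user][name] = rec
        rec.state = "RUNNING"
        rec.app_ids.append(app_id)

    def record_stop(self, user, name):
        """Mark a service as stopped; raises KeyError if it was never started."""
        self._by_user[user][name].state = "STOPPED"

    def list_services(self, user):
        """Return all records for a user, stopped ones included, by name."""
        return sorted(self._by_user.get(user, {}).values(), key=lambda r: r.name)


if __name__ == "__main__":
    idx = ServiceIndex()
    idx.record_start("alice", "hbase", "application_1505900000000_0001")
    idx.record_stop("alice", "hbase")
    idx.record_start("alice", "hbase", "application_1505900000000_0002")
    for rec in idx.list_services("alice"):
        print(rec.name, rec.state, rec.app_ids)
```

A real deployment would need durable storage behind this interface; the point of the sketch is only that a per-user lookup replaces a directory scan.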
[jira] [Commented] (YARN-7215) REST API to list all deployed services by the same user
[ https://issues.apache.org/jira/browse/YARN-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172369#comment-16172369 ]

Jian He commented on YARN-7215:
-------------------------------

Another approach: we can simply get the list of services from the RM with a type filter set to "yarn-service". In fact, I was trying to implement that but then ran into a bug, YARN-7076.