[ https://issues.apache.org/jira/browse/YARN-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219725#comment-16219725 ]
Jian He commented on YARN-7217: ------------------------------- Meta questions as Billie mentioned: - should solr and fs be a pluggable implementation of a common interface ? Basically, should it be either fs or solr back-end. Right now it's both there. Code comments: - For below, one constructor can call another to avoid code duplication {code} public ApiServer() { YARN_CONFIG = new YarnConfiguration(); try { if (SERVICE_CLIENT==null) { SERVICE_CLIENT = new ServiceClient(); SERVICE_CLIENT.init(YARN_CONFIG); SERVICE_CLIENT.start(); } } catch (Exception e) { LOG.error("Fail to initialize ServiceClient, ApiServer is unavailable."); } } /** * Constructor used by ResourceManager. * * @param conf */ @Inject public ApiServer(Configuration conf) { YARN_CONFIG = conf; try { if (SERVICE_CLIENT==null) { SERVICE_CLIENT = new ServiceClient(); SERVICE_CLIENT.init(YARN_CONFIG); SERVICE_CLIENT.start(); } } catch (Exception e) { LOG.error("Fail to initialize ServiceClient, ApiServer is unavailable."); } } {code} - getServicesList: it assumes solr is enabled, if not, it will throw NPE. I think we should conditionally check if solr is enabled, if not, throw exception saying only solr backend is supported for this endpoint. - similarly for getServiceSpec endpoint, it will throw NPE because ysc is null, if solr is not enabled. - updateServiceSpec: it only updates the spec in hdfs and solr, what's the use case of this api ? - similarly TestYarnNativeServices#testChangeSpec, as discussed, we won't need to restart the entire service to update the spec ? what's the use case for this ? - {{/services/{service_name}/state}} : call it status ? IMHO, state is like STOPPED or STARTED, status is more like a group of information - Should it be if solr is enabled, create the solrClient ? if solr is not enabled, there's no point creating the solrClient {code} addService(yarnClient); + createYarnSolrClient(configuration); super.serviceInit(configuration); {code} - The entire code can be inside the isEnabled block, similarly for other places {code} if (UserGroupInformation.isSecurityEnabled()) { try { userName = UserGroupInformation.getCurrentUser().getUserName(); } catch(KerberosAuthException e) { userName = null; } } if (ysc.isEnabled()) { try { ysc.deployApp(service, userName); } catch (ServiceUnavailableException e) { throw new YarnException("Fail to persist service spec."); } } {code} - we can create a common method for this, since it is used in so many places {code} if (UserGroupInformation.isSecurityEnabled()) { try { userName = UserGroupInformation.getCurrentUser().getUserName(); } catch(KerberosAuthException e) { userName = null; } } {code} - updateComponent api should also update the spec in solr ? - All services configs are currently in YarnServiceConf class, I think we can put the new configs there to not mix with the core YarnConfigurations, until the feature and config namings are stable, we can merge them back to YarnConfiguration. - ApiServerTest class is not used, remove? - ServiceClientTest: createYarnSolrClient and serviceInit are the same as in the parent class, no need to override ? - the username parameter is not used in findAppEntry API at all, but the deployApp inserts the username, then why is the username required in the first place ? - similarly, username is not used in deleteApp, then why do we need to get the username in caller in the first place - could you explain below logic ? looks like it tries to look for all entries with "id:appName" and the while loop continues until the last one is find, and return the last one . Presumbaly there's only 1 entry, then why is a while loop required? If there are multiple entries, why returning the last one ? {code} QueryResponse response; try { response = solr.query(query); Iterator<SolrDocument> appList = response.getResults() .listIterator(); while (appList.hasNext()) { SolrDocument d = appList.next(); entry = mapper.readValue(d.get("yarnfile_s").toString(), Service.class); found = true; } if (!found) { throw new ApplicationNotFoundException("Application entry is " + "not found: " + appName); } } catch (SolrServerException | IOException e) { LOG.error("Error in finding deployed application: " + appName, e); } return entry; {code} - YarnSolrClient#getEnabled is duplicated with isEnabled method. - YarnSolrClient#check, if the caller is only invoking the solrClient API if solr is enabled, then it won't enter this method in the first place, so the check is not needed? > Improve API service usability for updating service spec and state > ----------------------------------------------------------------- > > Key: YARN-7217 > URL: https://issues.apache.org/jira/browse/YARN-7217 > Project: Hadoop YARN > Issue Type: Task > Components: api, applications > Reporter: Eric Yang > Assignee: Eric Yang > Attachments: YARN-7217.yarn-native-services.001.patch, > YARN-7217.yarn-native-services.002.patch, > YARN-7217.yarn-native-services.003.patch, > YARN-7217.yarn-native-services.004.patch, > YARN-7217.yarn-native-services.005.patch > > > API service for deploy, and manage YARN services have several limitations. > {{updateService}} API provides multiple functions: > # Stopping a service. > # Start a service. > # Increase or decrease number of containers. (This was removed in YARN-7323). > The overloading is buggy depending on how the configuration should be applied. > h4. Scenario 1 > A user retrieves Service object from getService call, and the Service object > contains state: STARTED. The user would like to increase number of > containers for the deployed service. The JSON has been updated to increase > container count. The PUT method does not actually increase container count. > h4. Scenario 2 > A user retrieves Service object from getService call, and the Service object > contains state: STOPPED. The user would like to make a environment > configuration change. The configuration does not get updated after PUT > method. > This is possible to address by rearranging the logic of START/STOP after > configuration update. However, there are other potential combinations that > can break PUT method. For example, user like to make configuration changes, > but not yet restart the service until a later time. > h4. Scenario 3 > There is no API to list all deployed applications by the same user. > h4. Scenario 4 > Desired state (spec) and current state are represented by the same Service > object. There is no easy way to identify "state" is desired state to reach > or, the current state of the service. It would be nice to have ability to > retrieve both desired state, and current state with separated entry points. > By implementing /spec and /state, it can resolve this problem. > h4. Scenario 5 > List all services deploy by the same user can trigger a directory listing > operation on namenode if hdfs is used as storage for metadata. When hundred > of users use Service UI to view or deploy applications, this will trigger > denial of services attack on namenode. The sparse small metadata files also > reduce efficiency of Namenode memory usage. Hence, a cache layer for storing > service metadata can reduce namenode stress. > h3. Proposed change > ApiService can separate the PUT method into two PUT methods for configuration > changes vs operation changes. New API could look like: > {code} > @PUT > /ws/v1/services/[service_name]/spec > Request Data: > { > "name": "amp", > "components": [ > { > "name": "mysql", > "number_of_containers": 2, > "artifact": { > "id": "centos/mysql-57-centos7:latest", > "type": "DOCKER" > }, > "run_privileged_container": false, > "launch_command": "", > "resource": { > "cpus": 1, > "memory": "2048" > }, > "configuration": { > "env": { > "MYSQL_USER":"${USER}", > "MYSQL_PASSWORD":"password" > } > } > } > ], > "quicklinks": { > "Apache Document Root": > "http://httpd.${SERVICE_NAME}.${USER}.${DOMAIN}:8080/", > "PHP MyAdmin": "http://phpmyadmin.${SERVICE_NAME}.${USER}.${DOMAIN}:8080/" > } > } > {code} > {code} > @PUT > /ws/v1/services/[service_name]/state > Request data: > { > "name": "amp", > "components": [ > { > "name": "mysql", > "state": "STOPPED" > } > ] > } > {code} > SOLR can be used to cache Yarnfile to improve lookup performance and reduce > stress of namenode small file problems and high frequency lookup. SOLR is > chosen for caching metadata because its indexing feature can be used to build > full text search for application catalog as well. > For service that requires configuration changes to increase or decrease node > count. The calling sequence is: > {code} > # GET /ws/v1/services/{service_name}/spec > # Change number_of_containers to desired number. > # PUT /ws/v1/services/{service_name}/spec to update the spec. > # PUT /ws/v1/services/{service_name}/state to stop existing service. > # PUT /ws/v1/services/{service_name}/state to start service. > {code} > For components that can increase node count without rewrite configuration: > {code} > # GET /ws/v1/services/{service_name}/spec > # Change number_of_containers to desired number. > # PUT /ws/v1/services/{service_name}/spec to update the spec. > # PUT /ws/v1/services/{service_name}/component/{component_name} to change > node count. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org