[ 
https://issues.apache.org/jira/browse/YARN-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518163#comment-14518163
 ] 

Zhijie Shen commented on YARN-3539:
-----------------------------------

Steve, thanks for consolidating the patch. Here are some of my comments and
thoughts.

bq. What is essential is that all the existing operations must not change, so 
that shipping applications do not break.

Yeah, we can retain the v1 APIs (this is actually what we're doing now), but
the problem is around "do not break". Does it mean ATS v2 should be compatible
with the v1 APIs? In other words, do we support a user's old app using the v1
client to talk to a v2 server?

bq. is is critical to declare that ATSv1 is stable. Without that guarantee, it 
is impossible for any application to commit to using the APIs. 
bq. Spark depends on this for the SPARK-1537 feature, some ongoing worth with 
Accumulo depends on this, when Slider adds ATS support we'll depend on this 
stability guarantee, etc, etc.

I well understand the desirability of stable APIs. However, I can see that TEZ
and Hive/Pig on TEZ started integrating the service even without our declaring
the APIs stable. Though the APIs were not declared stable, that didn't mean we
kept changing them from release to release. Instead, the reality is that the
timeline API has been almost compatible since 2.4. Marking it as unstable
before was more like reserving the right to change it to improve the service.
So I'm not sure this is good timing, as we foresee that in the near future we
will upgrade to ATS v2, which may significantly refurbish the APIs.

bq. One area that is not covered in the ATSv1 API is what constitutes a valid 
entity type or domain?.

Do you mean the mandatory fields? For an entity, they're type, id and starttime
(which can be optional if the entity contains at least one event). For an
event, they are type and timestamp. For a domain, it's id.
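
To make the mandatory-field rules above concrete, here is a minimal sketch in
Python. It is only an illustration: the dictionary keys ("type", "id",
"starttime", "events", "timestamp") mirror the wording of this comment and are
assumptions, not necessarily the field names used on the wire, and the helper
functions are hypothetical.

```python
def missing_entity_fields(entity):
    """Return the mandatory fields that are absent from an entity dict."""
    missing = [key for key in ("type", "id") if not entity.get(key)]
    # starttime may be omitted when the entity carries at least one event.
    if entity.get("starttime") is None and not entity.get("events"):
        missing.append("starttime")
    return missing


def missing_event_fields(event):
    """Return the mandatory fields that are absent from an event dict."""
    return [key for key in ("type", "timestamp") if event.get(key) is None]


def missing_domain_fields(domain):
    """Return the mandatory fields that are absent from a domain dict."""
    return [key for key in ("id",) if not domain.get(key)]
```

For example, an entity with a type, an id and one event passes the check even
without a starttime, while the same entity without any event does not.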

bq. There is also the fact that the /domain path was added under 
/ws/v1/timeline/, so matches the path of entity types. Can you have an entity 
type called "domain"? Was it previously possible?

We cannot. "timeline/domain" has blocked the entity type "domain" since the
domain feature was added. I think we should state this in the documentation
(perhaps we want to reserve more names for future use). Other than this, I
think we shouldn't put any other constraint on naming the identifier.
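
The shadowing described above can be pictured with a toy path resolver in
Python. This is a sketch under stated assumptions, not the server's actual
routing code: the function name, the return values and the exact-match,
case-sensitive comparison are all hypothetical. Only the /ws/v1/timeline/
prefix and the "domain" segment come from this thread.

```python
BASE = "/ws/v1/timeline/"

# Path segments claimed by fixed resources (assumption: only "domain" today).
RESERVED_SEGMENTS = {"domain"}


def resolve(path):
    """Classify the first path segment under BASE.

    Reserved segments win over the entity-type path parameter, which is why
    an entity type with the same name becomes unreachable.
    """
    if not path.startswith(BASE):
        raise ValueError("not a timeline path: " + path)
    first = path[len(BASE):].split("/", 1)[0]
    if first in RESERVED_SEGMENTS:
        return ("reserved", first)
    return ("entity-type", first)
```

Reserving more names for future use would amount to adding entries to the
reserved set before any user picks them as entity types.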

bq. strictly defining what constitutes a valid entity type via a regular 
expression, and declaring whether the types are case sensitive.

This is a good idea. We can define the character set and the pattern to prevent
users from defining random names, but I'm not sure it is easy to put into
practice. The question is whether we're going to break existing users who
have already defined names that won't match our future regex.
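
One way to gauge that risk is to run a candidate regex over the entity types
already in use and list the ones it would reject. The Python sketch below
illustrates the check only: the pattern is a hypothetical candidate (nobody in
this thread has proposed a concrete one), and the sample names are made up.

```python
import re

# Hypothetical candidate: a letter, then letters, digits, "_", "." or "-".
CANDIDATE_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_.\-]*")


def names_broken_by(pattern, existing_names):
    """Return, sorted, the existing names that the pattern would reject."""
    return sorted(
        name for name in existing_names if pattern.fullmatch(name) is None
    )


# Made-up sample of entity types that might already exist in a store.
sample = ["FRAMEWORK_A_JOB", "my type", "9lives", "framework.b-task"]
broken = names_broken_by(CANDIDATE_PATTERN, sample)
```

With this sample, `broken` holds the two names that contain a space or start
with a digit. Note that the pattern is case sensitive as written; whether
entity types should be is a separate decision.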


Some comments about the patch:

1. For the bullet points of "Current Status and Future Plans", can we organize
them a bit better? For example, we could partition them into two groups: a)
current status and b) future plans. For bullet 4, it is not just history, but
all timeline data.

2. Can we move the "Timeline Server REST API" section before the "Generic Data
REST APIs" section?

3. The application elements table seems to be wrongly formatted. I think that's
why the site compilation failed.

4. The "Generic Data REST APIs" output examples need to be slightly updated.
Some fields have been added or changed.

5. The "Timeline Server REST API" output examples are not genuine. Perhaps we
can run a simple MR example job and use the up-to-date timeline entity and
application info as the examples.

One additional thing that is not covered by the documentation is entity
uniqueness. In v1, an entity is globally identified by <type, id>. This means
that if user1 has posted <type1, id1> in his application, user2 cannot post an
entity with the same identifier in his application, even if the two are
completely unrelated. Therefore, users are advised to come up with a unique
entity type for their framework to avoid namespace collisions.
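
The collision can be illustrated with a toy in-memory store in Python. This is
a simplified model, not the timeline server's implementation: the class, its
method and the boolean return value are hypothetical, and a real server may
report the conflict differently.

```python
class ToyTimelineStore:
    """Toy model in which an entity is keyed globally by (type, id)."""

    def __init__(self):
        # (entity_type, entity_id) -> user who first posted the entity
        self._owners = {}

    def put(self, user, entity_type, entity_id):
        """Record the entity; return False if another user already owns it."""
        key = (entity_type, entity_id)
        owner = self._owners.setdefault(key, user)
        return owner == user


store = ToyTimelineStore()
first = store.put("user1", "type1", "id1")
# Same <type, id> from an unrelated application: rejected in this model.
clash = store.put("user2", "type1", "id1")
# A framework-specific entity type (hypothetical prefix) avoids the clash.
namespaced = store.put("user2", "FRAMEWORK_B_type1", "id1")
```

Because the key carries no user or application component, the only way to stay
out of another framework's way in this model is to choose a distinct type.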



> Compatibility doc to state that ATS v1 is a stable REST API
> -----------------------------------------------------------
>
>                 Key: YARN-3539
>                 URL: https://issues.apache.org/jira/browse/YARN-3539
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 2.7.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-11826-001.patch, HADOOP-11826-002.patch, 
> YARN-3539-003.patch, YARN-3539-004.patch
>
>
> The ATS v2 discussion and YARN-2423 have raised the question: "how stable are 
> the ATSv1 APIs"?
> The existing compatibility document actually states that the History Server 
> is [a stable REST 
> API|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs],
>  which effectively means that ATSv1 has already been declared as a stable API.
> Clarify this by patching the compatibility document appropriately


