[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369772#comment-15369772 ] Hudson commented on YARN-3040: --
SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-3040. Make putEntities operation be aware of the app's context. (sjlee: rev d67c9bdb4db2b075484a779802ecf3296bad5cd4)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/GetTimelineCollectorContextRequest.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorManager.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/CollectorNodemanagerProtocolPBServiceImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/TestTimelineServiceClientIntegration.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/FileSystemTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/collector/TestTimelineCollectorManager.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/GetTimelineCollectorContextResponse.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestFileSystemTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/CollectorNodemanagerProtocolPBClientImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/collector/TestPerNodeTimelineCollectorsAuxService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/GetTimelineCollectorContextResponsePBImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/GetTimelineCollectorContextRequestPBImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/CollectorNodemanagerProtocol.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/collectornodemanager_protocol.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/Application.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/timelineservice/RMTimelineCollector.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorWebService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/AppLevelTimelineCollector.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382226#comment-14382226 ] Sangjin Lee commented on YARN-3040: --- Thanks much [~zjshen]! > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch, > YARN-3040.4.patch, YARN-3040.5.patch, YARN-3040.6.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382191#comment-14382191 ] Junping Du commented on YARN-3040: -- OK. I have committed the v6 patch to branch YARN-2928. Thanks [~zjshen] for contributing the patch, and [~sjlee0], [~vinodkv], [~gtCarrera9], [~kasha] and [~Naganarasimha] for the review comments!
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382152#comment-14382152 ] Zhijie Shen commented on YARN-3040: --- Sure, let's return null for now.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381908#comment-14381908 ] Junping Du commented on YARN-3040: -- It looks like the v5 patch fails to build: RMTimelineCollector (just added in YARN-3034) needs to override the abstract method getTimelineEntityContext() in TimelineCollector. Given that YARN-3390 tracks this issue separately, I think we can simply add a quick implementation (such as returning null) to RMTimelineCollector, as the v6 patch does. [~zjshen], can you confirm this?
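The stopgap discussed in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: `EntityContext`, `CollectorBase`, and `RmCollectorStub` are hypothetical stand-ins for the context type, `TimelineCollector`, and `RMTimelineCollector`; only the method name `getTimelineEntityContext()` comes from the comment.

```java
class OverrideStubDemo {
  // Hypothetical stand-in for the context type; its real fields are not shown in the thread.
  static class EntityContext {
    final String clusterId;
    EntityContext(String clusterId) { this.clusterId = clusterId; }
  }

  // Hypothetical stand-in for the abstract TimelineCollector.
  abstract static class CollectorBase {
    // Every concrete collector must state which context its entities belong to.
    protected abstract EntityContext getTimelineEntityContext();

    // Callers have to tolerate a null context while the stub is in place.
    String describeContext() {
      EntityContext ctx = getTimelineEntityContext();
      return ctx == null ? "no-context" : ctx.clusterId;
    }
  }

  // Hypothetical stand-in for RMTimelineCollector with the quick fix applied.
  static class RmCollectorStub extends CollectorBase {
    @Override
    protected EntityContext getTimelineEntityContext() {
      return null; // placeholder only; the proper context is tracked separately
    }
  }

  public static void main(String[] args) {
    System.out.println(new RmCollectorStub().describeContext());
  }
}
```

The point of the sketch is only that a null-returning override satisfies the compiler; any code that consumes the context then needs a null check until a real implementation lands.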
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380880#comment-14380880 ] Junping Du commented on YARN-3040: --
bq. I am +1 for having a stable default value if users didn't set it. The use case we're trying to address is a single small cluster setup. If there is a timestamp in the default value or it is randomly generated, each time the cluster starts up, a new logical cluster is created in the timeline service, which is not desirable.
This is what the v5 patch does now, and it does not conflict with any expected behavior of an HA cluster. YARN-3399 is meant to be an improvement and should not break the key assumptions here. However, it would be good to keep an eye on it.
bq. For the most part, we do want users to set this value; e.g. "production", "ad-hoc", "science", etc.
Agreed. Users can also add a domain prefix or suffix if they want, e.g. "org.A.production-1", "com.B.test-2", etc. Any other concerns?
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380837#comment-14380837 ] Sangjin Lee commented on YARN-3040: --- I am +1 for having a *stable* default value if users didn't set it. The use case we're trying to address is a single small cluster setup. If there is a timestamp in the default value or it is randomly generated, each time the cluster starts up, a new logical cluster is created in the timeline service, which is not desirable. For the most part, we do want users to set this value; e.g. "production", "ad-hoc", "science", etc.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380816#comment-14380816 ] Junping Du commented on YARN-3040: -- To be clear, the behavior when cluster-id is not set in the RM HA case is unchanged by the v5 patch: EmbeddedElectorService will fail with an exception. Given that we already have a separate JIRA to track this issue, there is no reason to hold this one up. +1 on the v5 patch. I will go ahead and commit it today unless further comments come in.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380731#comment-14380731 ] Zhijie Shen commented on YARN-3040: ---
bq. if cluster-id is set, and if not, generate one like cluster- without having a default value per se?
I originally proposed this approach, but it turned out to be difficult to coordinate among multiple RMs and across RM restarts. In any case, let's file a separate JIRA to continue the discussion of the default cluster ID for RM HA: YARN-3399.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380635#comment-14380635 ] Junping Du commented on YARN-3040: --
bq. How about we check if cluster-id is set, and if not, generate one like cluster- without having a default value per se?
That makes sense. However, I think we should fix it in the trunk branch. Maybe we should file a separate JIRA to track that?
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380618#comment-14380618 ] Karthik Kambatla commented on YARN-3040: When HA is enabled, conflicting cluster-ids could lead to RMs from different clusters participating in leader election. How about we check if cluster-id is set, and if not, generate one like cluster- without having a default value per se?
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380578#comment-14380578 ] Zhijie Shen commented on YARN-3040: --- [~vinodkv], thanks for pointing to the right JIRA. Per my previous [comment|https://issues.apache.org/jira/secure/EditComment!default.jspa?id=12766984&commentId=14380443], hopefully it makes sense to you that the timeline service needs the default cluster ID. As the last patch doesn't affect the HA logic that uses the cluster ID config, we can move the discussion of whether HA needs a default cluster ID to a separate thread to unblock this work.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380537#comment-14380537 ] Vinod Kumar Vavilapalli commented on YARN-3040: --- The right JIRA is YARN-1029. See my comment [here|https://issues.apache.org/jira/browse/YARN-1029?focusedCommentId=13861990&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13861990] and the next discussion with Karthik. I think I missed the point about randomly generating "unique-enough" ClusterIDs there. That should alleviate some concerns of a default. /cc [~kasha]
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380443#comment-14380443 ] Zhijie Shen commented on YARN-3040: --- First, in our use case, we'd like to have a default cluster ID, which is going to be part of the primary key (PK) that identifies the entity object. With a default, users can use the timeline service with a simple setup, just as we don't require users to come up with a flow ID for an orphan app. Second, I checked YARN-986 and YARN-1232. It is not clear why no default was provided before. Anyway, to make sure the default doesn't affect the HA logic, I removed the default from yarn-default.xml. Since HA uses YarnConfiguration#getClusterId, it is separate from the code path where we use the default cluster ID. We can revisit whether HA should have a default cluster ID when users don't provide one.
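The separation of the two code paths described in this comment could look roughly like the sketch below. It is an illustration under stated assumptions rather than the Hadoop code: `java.util.Properties` stands in for the YARN configuration object, `requireClusterId` and `clusterIdOrDefault` are hypothetical helper names, and the exception type is chosen for the example. The key `yarn.resourcemanager.cluster-id` and the value `yarn_cluster` are taken from snippets quoted elsewhere in this thread.

```java
import java.util.Properties;

class ClusterIdLookupDemo {
  // Key quoted in the thread; the default value mirrors the snippet under review.
  static final String RM_CLUSTER_ID = "yarn.resourcemanager.cluster-id";
  static final String DEFAULT_RM_CLUSTER_ID = "yarn_cluster";

  // HA-style lookup (hypothetical helper): refuse to proceed when the ID is unset.
  static String requireClusterId(Properties conf) {
    String id = conf.getProperty(RM_CLUSTER_ID);
    if (id == null || id.trim().isEmpty()) {
      throw new IllegalStateException(RM_CLUSTER_ID + " is not set");
    }
    return id;
  }

  // Timeline-style lookup (hypothetical helper): fall back to a fixed default.
  static String clusterIdOrDefault(Properties conf) {
    return conf.getProperty(RM_CLUSTER_ID, DEFAULT_RM_CLUSTER_ID);
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    // With nothing configured, only the timeline-style lookup succeeds.
    System.out.println(clusterIdOrDefault(conf));
    conf.setProperty(RM_CLUSTER_ID, "production");
    // Once the user sets a value, both lookups agree.
    System.out.println(requireClusterId(conf) + " " + clusterIdOrDefault(conf));
  }
}
```

Because the default lives only in the second helper, adding it does not change what the strict lookup does when the property is missing, which is the property the comment relies on.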
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380194#comment-14380194 ] Junping Du commented on YARN-3040: -- Thanks [~zjshen] for updating the patch!
bq. I also agree "yarn.cluster.id" sounds better, but "yarn.resourcemanager.cluster-id" is the legacy name, which is used by RM HA for a while. As it's not sound so bad, how about keeping it, such that we don't need to deprecate config or break compatibility.
Yes, it should be fine to reuse "yarn.resourcemanager.cluster-id". One question: it looks like we don't set any default value for "yarn.resourcemanager.cluster-id" in the HA case, and YARN complains in the HA service when the value is not set. Do we need to follow that practice? I guess not, but I'm raising it for attention. The v4 patch looks pretty good to me. If there are no concerns about the issue above and no further comments, I will go ahead and commit it later.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378946#comment-14378946 ] Sangjin Lee commented on YARN-3040: ---
bq. It will check if the tag starts with "TIMELINE_FLOW_ID_TAG:", and then if the value is empty, "TIMELINE_FLOW_ID_TAG:".substring("TIMELINE_FLOW_ID_TAG".length() + 1) will return an empty value. It shouldn't throw IndexOutOfBoundsException. But it seems there's no need to add an empty env, I'll change the code accordingly.
Ack. I was thrown off because the code was like
{code}
if (tag.startsWith(TAG + ":")) {
  String value = tag.substring(TAG.length() + 1);
}
{code}
It works because the "+1" accounts for the colon. LGTM overall.
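The index arithmetic agreed on here can be checked with a small standalone sketch. The class and method names are hypothetical, and the tag constant simply reuses the literal from the discussion; the only library behavior relied on is that `String.substring(beginIndex)` accepts a `beginIndex` equal to the string's length and returns an empty string.

```java
class TagValueDemo {
  // Literal taken from the discussion; the real constant lives in the patch.
  static final String TAG = "TIMELINE_FLOW_ID_TAG";

  // Returns the text after "TAG:", or null when the tag has a different prefix.
  static String valueOf(String tag) {
    if (tag.startsWith(TAG + ":")) {
      // TAG.length() + 1 skips the tag name plus the ':' separator.
      return tag.substring(TAG.length() + 1);
    }
    return null;
  }

  public static void main(String[] args) {
    // A tag with no value yields an empty string rather than an exception.
    System.out.println("[" + valueOf(TAG + ":") + "]");
    System.out.println("[" + valueOf(TAG + ":my_flow") + "]");
  }
}
```

Whether an empty value should then be stored or skipped is a separate decision; the sketch only shows that the parsing step itself is safe.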
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378666#comment-14378666 ] Zhijie Shen commented on YARN-3040: --- Thanks for the review, Sangjin and Junping! I've updated the patch accordingly.
bq. I am comfortable with continuing to work on the flow-related items in the separate JIRA.
Thanks. This sounds good.
bq. I'm not sure of these set calls. Are these here just to initialize the context to default values?
Yes, these are the defaults. In fact, "user" is sure to be updated by the RPC call that gets the context info (unless there's a bug in the RPC). The current user used for initialization is usually not correct, but I kept it so that we always have a value to pass to the storage and avoid a possible NPE that would crash the process. If a bug occurs, we can easily debug/inspect the storage to verify the user. I added some code comments for the initialization.
bq. I would prefer something like "yarn.cluster.id" because this id is for identifying YARN cluster rather than ResourceManager.
I also agree "yarn.cluster.id" sounds better, but "yarn.resourcemanager.cluster-id" is the legacy name, which RM HA has used for a while. As it doesn't sound so bad, how about keeping it, so that we don't need to deprecate the config or break compatibility?
bq. Can we add a test case that without specifying flow_id and flow_run_id and v2 timeline service still can work?
Added the test case in the new patch.
bq. Do we need to be case-insensitive here? I think we can be strict about the tag names?
This is because tag text has case-sensitive and case-insensitive modes. In the insensitive mode, even if the user inputs upper-case strings, they are normalized to lower case. So we need to handle this case.
bq. You might want to be bit defensive about the tag not carrying any value (e.g. "TIMELINE_FLOW_ID_TAG:").
It will check if the tag starts with "TIMELINE_FLOW_ID_TAG:", and then if the value is empty, {{"TIMELINE_FLOW_ID_TAG:".substring("TIMELINE_FLOW_ID_TAG".length() + 1)}} will return an empty value. It shouldn't throw IndexOutOfBoundsException. But it seems there's no need to add an empty env, so I'll change the code accordingly. In addition, I fixed a couple of test failures in the new patch.
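One way to handle tags that may have been normalized to lower case, as described in the reply about case sensitivity, is a case-insensitive prefix match. The sketch below is an assumption about how this could be done and not the code from the patch: `flowIdFrom` is a hypothetical helper, and it uses the standard `String.regionMatches(boolean, int, String, int, int)` overload with `ignoreCase` set to true.

```java
class CaseInsensitiveTagDemo {
  // Prefix literal taken from the discussion, including the ':' separator.
  static final String PREFIX = "TIMELINE_FLOW_ID_TAG:";

  // Returns the value after the prefix, matching the prefix in any letter case.
  static String flowIdFrom(String tag) {
    // regionMatches returns false (no exception) when the tag is shorter than the prefix.
    if (tag.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
      return tag.substring(PREFIX.length());
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(flowIdFrom("TIMELINE_FLOW_ID_TAG:etl_daily"));
    System.out.println(flowIdFrom("timeline_flow_id_tag:etl_daily"));
    System.out.println(flowIdFrom("queue:etl_daily"));
  }
}
```

Note that this only makes the prefix check tolerant; if tags are lower-cased before they reach this code, the value portion arrives lower-cased as well and cannot be restored here.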
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377984#comment-14377984 ] Junping Du commented on YARN-3040: -- Some additional comments:
{code}
+    <description>The YARN cluster ID.</description>
+    <name>yarn.resourcemanager.cluster-id</name>
+    <value>yarn_cluster</value>
+
{code}
I would prefer something like "yarn.cluster.id" because this ID identifies the YARN cluster rather than the ResourceManager. It should stay consistent when the RMs (active and standby) switch over. Likewise, for names such as RM_CLUSTER_ID and DEFAULT_RM_CLUSTER_ID, we should use YARN_CLUSTER_ID instead.
{code}
@@ -208,7 +211,11 @@ public void testDSShell(boolean haveDomain, String timelineVersion)
     if (timelineVersion.equalsIgnoreCase("v2")) {
       String[] timelineArgs = {
           "--timeline_service_version",
-          "v2"
+          "v2",
+          "--flow",
+          "test_flow_id",
+          "--flow_run",
+          "12345678"
       };
{code}
Can we add a test case showing that the v2 timeline service still works without flow_id and flow_run_id specified? In my understanding, this info will remain optional for applications, so we should make sure it is nullable when launching applications and in the subsequent flows.
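The request that the flow information stay optional can be illustrated with a small fallback sketch. Everything here is hypothetical: the helper names, the `flow_` prefix for the generated flow ID, and the sample application ID are made up for the example, while the run ID default of "0" mirrors the `setFlowRunId("0")` call quoted elsewhere in this thread.

```java
class FlowDefaultsDemo {
  // Falls back to an ID derived from the app ID when no flow ID was supplied.
  // The "flow_" prefix is an assumption for illustration only.
  static String flowIdOrDefault(String flowId, String appId) {
    return (flowId == null || flowId.isEmpty()) ? "flow_" + appId : flowId;
  }

  // Falls back to "0" when no flow run ID was supplied.
  static String flowRunIdOrDefault(String flowRunId) {
    return (flowRunId == null || flowRunId.isEmpty()) ? "0" : flowRunId;
  }

  public static void main(String[] args) {
    String appId = "application_0000000000000_0001"; // hypothetical app ID
    // Launch without --flow / --flow_run: both values come from the defaults.
    System.out.println(flowIdOrDefault(null, appId) + " / " + flowRunIdOrDefault(null));
    // Launch with the arguments used in the test under review.
    System.out.println(flowIdOrDefault("test_flow_id", appId) + " / " + flowRunIdOrDefault("12345678"));
  }
}
```

A test for the "no flow arguments" case would then assert on the default values instead of expecting the launch to fail.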
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376749#comment-14376749 ] Sangjin Lee commented on YARN-3040: --- Thanks [~zjshen] for the updated patch! I am comfortable with continuing to work on the flow-related items in the separate JIRA. I'll jot down the key points in that JIRA shortly. I went over the latest patch, and overall it looks good. I do have a few comments:
(AppLevelTimelineCollector.java)
{code}
protected void serviceInit(Configuration conf) throws Exception {
  context.setClusterId(conf.get(YarnConfiguration.RM_CLUSTER_ID,
      YarnConfiguration.DEFAULT_RM_CLUSTER_ID));
  context.setUserId(UserGroupInformation.getCurrentUser().getShortUserName());
  context.setFlowId(TimelineUtils.generateDefaultFlowIdBasedOnAppId(appId));
  context.setFlowRunId("0");
  context.setAppId(appId.toString());
{code}
I'm not sure about these set calls. Are they here just to initialize the context to default values? For example, in the case of a per-node daemon, UGI.getCurrentUser().getShortUserName() would return the user under which the daemon was started (whether it is the NM or a standalone daemon), which is highly likely to be incorrect. Do we need to bother setting default values if they are going to be incorrect anyway, as for the user? At minimum, it would be helpful to have a comment here explaining why this is being done.
(AMLauncher.java)
- Do we need to be case-insensitive here? I think we can be strict about the tag names?
- You might want to be a bit defensive about the tag not carrying any value (e.g. "TIMELINE_FLOW_ID_TAG:"). If the value is empty, tag.substring() would throw an IndexOutOfBoundsException.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376602#comment-14376602 ]

Junping Du commented on YARN-3040:
----------------------------------

Hi [~zjshen], thanks for the patch! I am still reviewing the patch but have some quick comments so far:

{code}
+  public static String generateDefaultClusterIdBasedOnAppId(
+      ApplicationId appId) {
+    return "cluster_" + appId.getClusterTimestamp();
+  }
{code}
It seems the appId's ClusterTimestamp comes from the RM and changes every time the RM restarts. I think we need a ClusterID here that stays consistent across RM restarts, don't we? Otherwise, applications submitted to the same cluster could get different ClusterIDs just because the RM failed over, which is not what users would expect. I suggest adding a configuration that lets the user specify a ClusterID, and otherwise generating a default (and variable) value for test purposes.

{code}
+  rpc getTimelienCollectorContext (GetTimelineCollectorContextRequestProto) returns (GetTimelineCollectorContextResponseProto);
{code}
There is a typo here and in other places: "Timelien" should be "Timeline".

{code}
-import java.util.ArrayList;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
-import java.util.Vector;
+import java.util.*;
{code}
We shouldn't do this, as it could load unnecessary classes.

{code}
+ * The aggregator needs to get the context information including user, flow
{code}
aggregator => collector
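One way to read the cluster ID suggestion above is sketched below: take the ID from configuration and fall back to a fixed default, so that the value never depends on the RM start timestamp. The sketch uses `java.util.Properties` as a stand-in for Hadoop's `Configuration`; the class `ClusterIds` and its method are hypothetical, and the key and default value are simply the ones quoted elsewhere in this thread.

```java
import java.util.Properties;

// Hypothetical sketch; Properties stands in for Hadoop's Configuration.
final class ClusterIds {
  // Key and default as quoted in this thread; the final names were still under discussion.
  static final String CLUSTER_ID_KEY = "yarn.resourcemanager.cluster-id";
  static final String DEFAULT_CLUSTER_ID = "yarn_cluster";

  private ClusterIds() {}

  /** Returns the configured cluster ID, or a fixed default that survives RM restarts. */
  static String resolve(Properties conf) {
    String configured = conf.getProperty(CLUSTER_ID_KEY);
    if (configured == null || configured.trim().isEmpty()) {
      // Deliberately not derived from the RM start timestamp.
      return DEFAULT_CLUSTER_ID;
    }
    return configured.trim();
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(resolve(conf)); // prints yarn_cluster
    conf.setProperty(CLUSTER_ID_KEY, "prod-east");
    System.out.println(resolve(conf)); // prints prod-east
  }
}
```

A fixed default keeps the same logical cluster across restarts, at the cost of not distinguishing two unconfigured clusters from each other.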
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376400#comment-14376400 ]

Zhijie Shen commented on YARN-3040:
-----------------------------------

[~sjlee0], thanks for the additional comments, but would you mind continuing the flow attributes discussion in YARN-3391 to unblock this JIRA? In this JIRA, how about focusing on the data flow for passing this context info to the collector? For flow info, no matter what it should be specifically, this patch works out the path to collect it from the user via the application submission context and pass it to the RM, the NM and finally the collector. If we're okay with this approach, it is easy for us to add new flow info or correct existing flow info later on.

I filed YARN-3391 to fork the flow-related discussion.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376285#comment-14376285 ]

Sangjin Lee commented on YARN-3040:
-----------------------------------

{quote}
I can understand this particular case described above. Like my prior comment about flow run ID, my concern is whether flow/version/run's explicit hierarchy is general enough to capture most use cases. IMHO, by nature, the hierarchy is the tree of flows, and a flow can be a flow of flows or a flow of apps. However, if other users just want to use one level of flow, version/run info seems to be redundant. On the other hand, if we use the recursive flow structure, it's flexible enough to have from one to many flow levels. We can treat the first level as the flow, the second as the version and the third as the run. I don't have expert knowledge of workflow systems such as Oozie, but just want to think out loud about my concern. That said, if flow/version/run is the general description of a flow, I agree we should pass in these three env vars together and separately.
{quote}
Agreed that we need to consider both use cases (single level and multi-level). I just want to clarify that even with one level of flows, it is possible (and in fact it is more common) that there are multiple runs for a given flow version, and multiple versions for a given flow name; e.g. "foo.pig"/"v.1"/1, "foo.pig"/"v.1"/2, ..., "foo.pig"/"v.2"/10, "foo.pig"/"v.2"/11, ...

Also, my mental model is that flow id/version/run-id is not a hierarchy. It's just a group of 3 attributes (although there is some implied containment relationship). Also, when we store these 3 attributes in the storage, I suspect schemas like HBase/Phoenix will probably make only the flow id (name) and the flow run id part of the primary/row key, and store the flow version in a separate table.
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376255#comment-14376255 ]

Sangjin Lee commented on YARN-3040:
-----------------------------------

bq. I can see the benefit. For example, if it represents the timestamp, we can filter the flow runs and say give me the runs in the last 5 mins. But my concern is whether it's the general way to let user to describe a run.

The design doc says the flow runs for a given flow must have "unique and totally ordered run identifiers". We obviously had numbers in mind when we wrote that (mostly because of the ease of sorting and ordering in the storage). And that's the convention we will push frameworks to use. I think it is important that we make it a number (long).

However, there is a difference between having numbers as run ids and having timestamps as run ids. I don't think we need to go so far as requiring timestamps as run ids. As long as they are numbers, I think it would be fine. I can imagine some flows using run ids like "1", "2", ...

We could allow any arbitrary scheme to generate the run ids, but the challenge is that it might seriously hamper the ability to store and sort them efficiently. And, in most cases, the timestamp of the flow start is quite a natural scheme, and I would think most frameworks will just adopt that scheme. What do you think?

On a related note, we should also generate the default run id if it is missing. I realize this could be a bit tricky. If the flow id is also missing, then we're treating this single YARN app as a flow in and of itself. Then we can do flow/version/run id = (yarn app name)/("1")/(app submission timestamp). This is also mentioned in the design doc.

However, if the flow id is provided but not the flow run id, it can be tricky, as there can be multiple YARN apps for the given flow run. One obvious solution might be to reject app submission if the flow client (not the timeline client) sets the flow id but not the flow run id. For that we'd need some kind of a common layer for checks. Thoughts?
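The defaulting rules discussed above can be sketched as a single check. This is a minimal illustration, not the patch's behavior: the class `FlowDefaults` is hypothetical, defaulting a missing version to "1" when a flow id is given is an added assumption, and rejecting a flow id without a run id is only the "one obvious solution" floated in the comment.

```java
// Hypothetical sketch of the defaulting rules; not part of any patch.
final class FlowDefaults {
  /** Resolved flow attributes; the run id is a long, as argued in the thread. */
  static final class Flow {
    final String id;
    final String version;
    final long runId;

    Flow(String id, String version, long runId) {
      this.id = id;
      this.version = version;
      this.runId = runId;
    }
  }

  private FlowDefaults() {}

  static Flow resolve(String flowId, String flowVersion, Long flowRunId,
      String appName, long appSubmitTime) {
    if (flowId == null || flowId.isEmpty()) {
      // No flow id: the single YARN app is treated as a flow in itself,
      // i.e. (app name)/("1")/(app submission timestamp).
      return new Flow(appName, "1", appSubmitTime);
    }
    if (flowRunId == null) {
      // Flow id without run id: several apps may belong to one run,
      // so no safe default exists; this sketch rejects the submission.
      throw new IllegalArgumentException(
          "flow id '" + flowId + "' was set without a flow run id");
    }
    // Assumption: a missing version falls back to "1".
    String version =
        (flowVersion == null || flowVersion.isEmpty()) ? "1" : flowVersion;
    return new Flow(flowId, version, flowRunId);
  }

  public static void main(String[] args) {
    Flow f = resolve(null, null, null, "wordcount", 1427000000000L);
    // prints wordcount/1/1427000000000
    System.out.println(f.id + "/" + f.version + "/" + f.runId);
  }
}
```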
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376169#comment-14376169 ]

Zhijie Shen commented on YARN-3040:
-----------------------------------

bq. It sounds not quite scalable if we have one client for each app in the RM...

In RM/NM, I think we can and should implement a wrapper layer, which may contain multiple applications, so that a delegator can write the data for multiple applications.

bq. One most significant advantage to have run ids as integers is we can easily sort all existing runs for one flow in ascending or descending order. This might be a solid use case in general?

I can see the benefit. For example, if it represents the timestamp, we can filter the flow runs and say "give me the runs in the last 5 mins". But my concern is whether it's a general way to let users describe a run.

bq. Hmm, I didn't think the version as part of the flow id.

I can understand this particular case described above. Like my prior comment about flow run ID, my concern is whether flow/version/run's explicit hierarchy is general enough to capture most use cases. IMHO, by nature, the hierarchy is the tree of flows, and a flow can be a flow of flows or a flow of apps. However, if other users just want to use one level of flow, version/run info seems to be redundant. On the other hand, if we use the recursive flow structure, it's flexible enough to have from one to many flow levels. We can treat the first level as the flow, the second as the version and the third as the run. I don't have expert knowledge of workflow systems such as Oozie, but just want to think out loud about my concern.

That said, if flow/version/run is the general description of a flow, I agree we should pass in these three env vars together and separately.

bq. Mostly fine, but I have some concerns about rolling upgrades.
bq. I'm still not sure why it would make sense to have different logical cluster id's every time the RM/cluster restarts.

I meant the admin can configure a cluster ID explicitly, which won't have the timestamp appended. I added the timestamp to the default value to distinguish the clusters that are started by you and me, but having thought about it again, the concern about RM restarts makes sense. I'll change the default not to append the timestamp.
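The wrapper layer mentioned above could look roughly like the following: one shared writer that keeps a context per application and passes it to the storage on every call. Everything here is hypothetical (the class `MultiAppWriter`, the `Backend` interface, the context fields, and the use of a plain String as the entity); it only illustrates the idea of a single delegator serving many applications.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of one writer shared by many applications.
final class MultiAppWriter {
  /** Per-application context; the field set is illustrative. */
  static final class AppContext {
    final String clusterId;
    final String userId;
    final String flowId;
    final long flowRunId;

    AppContext(String clusterId, String userId, String flowId, long flowRunId) {
      this.clusterId = clusterId;
      this.userId = userId;
      this.flowId = flowId;
      this.flowRunId = flowRunId;
    }
  }

  /** Stand-in for the storage writer; receives the context on each call. */
  interface Backend {
    void write(AppContext context, String appId, String entity);
  }

  private final Map<String, AppContext> contexts = new ConcurrentHashMap<>();
  private final Backend backend;

  MultiAppWriter(Backend backend) {
    this.backend = backend;
  }

  void register(String appId, AppContext context) {
    contexts.put(appId, context);
  }

  void unregister(String appId) {
    contexts.remove(appId);
  }

  /** Looks up the application's context and delegates the write. */
  void putEntity(String appId, String entity) {
    AppContext context = contexts.get(appId);
    if (context == null) {
      throw new IllegalStateException("no context registered for " + appId);
    }
    backend.write(context, appId, entity);
  }

  public static void main(String[] args) {
    List<String> log = new ArrayList<>();
    MultiAppWriter writer = new MultiAppWriter(
        (ctx, appId, entity) -> log.add(ctx.flowId + "/" + appId + "/" + entity));
    writer.register("app_1", new AppContext("yarn_cluster", "alice", "flowA", 1L));
    writer.register("app_2", new AppContext("yarn_cluster", "bob", "flowB", 2L));
    writer.putEntity("app_2", "container_started");
    System.out.println(log); // prints [flowB/app_2/container_started]
  }
}
```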
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372412#comment-14372412 ]

Sangjin Lee commented on YARN-3040:
-----------------------------------

[~zjshen], thanks for your updated patch and prompt answers! I'll go over the new patch in some more detail, and get back to you. I haven't looked at the patch just yet, and therefore I might be saying something dumb, but I thought I'd reply to some of your points. Hopefully this will move things forward.

bq. RM will have all the above context info. When constructing and starting RM collector, we should make sure it be setup.

Since the RM's collector will handle multiple applications, there is no one-to-one relationship between flow/flow-run/app and an instance of the RM collector. The RM will just have to retain that information in memory for multiple apps, and pass it along on a per-call basis to the storage.

bq. Personally, I prefer to user ID to be uniform among the all the context properties. ID indicates it can be used to identify a flow.

I'm OK with "flow id" if it increases consistency.

bq. I thought version is part of flow id. I think we can revisit it once the schema is done, and we finalized the generic description about the flow structure and the notation. So far I'd like to keep it as what it is now. Thoughts?

Hmm, I didn't think of the version as part of the flow id. Here we're thinking a bit ahead to the storage and query aspects of it, but it's perfectly feasible to ask questions like "give me the latest 10 runs of the flow named 'foo.pig'". Note that those latest 10 runs can have different versions. This implies there needs to be a semantic differentiation between the flow id (name) and the flow version. Namely, in this query the flow version is *not* used to retrieve the last 10 runs. So I would advocate having a field/attribute named "flow version" that is separate from "flow id".

As for the run id being numeric, as Li alluded to, there is a significant advantage in having run ids as numbers (longs, really), as it lends itself to super-easy sorting. It's a little bit of a storage concern leaking into the higher-level abstraction, but it's a strong reason to qualify it as a number IMO.

bq. It makes sense, but when RM restarts we use the new start time of RM to identify the app instead of the one before. In current way, cluster_xyz will contain the application_xyz_123. This was my rationale before. And this default cluster id construction is only used in the case the user didn't specify the cluster id in config file. In production, user should specify one. I'll thought about the question again.

I'm still not sure why it would make sense to have different logical cluster ids every time the RM/cluster restarts. Logically, a single cluster should be identified by a long-lived name. For example, UIs will be built on questions like "give me top 10 flows on cluster ABC". Queries like that surely wouldn't care about cluster restarts. As for the default value, in fact I would imagine most use cases would not set the cluster id (just assuming the cluster default would be filled in). That would be the norm, not the exception.

Hope these help...
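The "latest 10 runs of the flow named 'foo.pig'" query above can be illustrated in memory. The sketch assumes a hypothetical `FlowRun` value type and an in-memory list; it is not how the timeline storage is queried, but it shows the flow id selecting the runs, the numeric run id ordering them, and the version playing no part in either step.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration; not a storage query.
final class FlowRunQuery {
  static final class FlowRun {
    final String flowId;
    final String flowVersion;
    final long runId;

    FlowRun(String flowId, String flowVersion, long runId) {
      this.flowId = flowId;
      this.flowVersion = flowVersion;
      this.runId = runId;
    }
  }

  private FlowRunQuery() {}

  /** Latest runs of one flow, newest first; the version is not consulted. */
  static List<FlowRun> latestRuns(List<FlowRun> runs, String flowId, int limit) {
    return runs.stream()
        .filter(r -> r.flowId.equals(flowId))
        .sorted(Comparator.comparingLong((FlowRun r) -> r.runId).reversed())
        .limit(limit)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<FlowRun> runs = Arrays.asList(
        new FlowRun("foo.pig", "v.1", 1L),
        new FlowRun("foo.pig", "v.1", 2L),
        new FlowRun("foo.pig", "v.2", 10L),
        new FlowRun("bar.pig", "v.1", 99L));
    for (FlowRun r : latestRuns(runs, "foo.pig", 2)) {
      System.out.println(r.flowId + "/" + r.flowVersion + "/" + r.runId);
    }
  }
}
```

With string run ids the same ordering would need extra care, since "10" sorts before "2" lexicographically.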
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372384#comment-14372384 ]

Li Lu commented on YARN-3040:
-----------------------------

Hi [~zjshen], some quick thoughts...

bq. It sounds like each NM will need to have multiple timeline clients (one for each application).
bq. That's correct.
bq. The RM will have its own collector, and it does not go through the TimelineClient API. How would that work?
bq. RM will have all the above context info. When constructing and starting RM collector, we should make sure it be setup.

Both the RM and the NMs are posting predefined "application history info" rather than "generic" info (I'm trying to use the wording from ATS v1, but correct me if I'm wrong). I'm wondering if it's possible to have another client implementation, based on our existing implementation, that can handle multiple applications within the same client. It sounds not quite scalable if we have one client for each app in the RM...

bq. I thought version is part of flow id. I think we can revisit it once the schema is done, and we finalized the generic description about the flow structure and the notation. So far I'd like to keep it as what it is now. Thoughts?

The most significant advantage of having run ids as integers is that we can easily sort all existing runs for one flow in ascending or descending order. This might be a solid use case in general?

bq. It makes sense, but when RM restarts we use the new start time of RM to identify the app instead of the one before. In current way, cluster_xyz will contain the application_xyz_123. This was my rationale before. And this default cluster id construction is only used in the case the user didn't specify the cluster id in config file. In production, user should specify one. I'll thought about the question again.

Mostly fine, but I have some concerns about rolling upgrades. With rolling upgrades, if we're not specifying cluster ids explicitly, applications that live across an upgrade will have two different primary keys. Even though we may merge them in our reader (which still sounds suboptimal), this may pose a challenge to our aggregators (data will be aggregated into two different entities over time). Any suggestions on this?
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371899#comment-14371899 ]

Sangjin Lee commented on YARN-3040:
-----------------------------------

Hi [~zjshen], thanks much for working on this. I just took a quick look at the patch and the discussion. It seems like you'll update it soon, but I'll pass along my comments just in case.

One high-level comment: the original intent of this JIRA is more of an end-to-end flow of the flow information (flow name, flow version, and flow run id). How can individual frameworks (MR, Tez, ...) set these attributes and pass them to the RM at the time of the application launch? How does that information get passed to the TimelineClient and to the timeline collector? We do need the API for the beginning portion of the end-to-end picture as well.

bq. new TimelineClient is constructed per application, and in the context of one application, we can reasonably assume this context information should be unchanged.

There are a couple of things to consider here (and it sounds like that may be part of the offline discussion). We need to make sure we handle the case of NMs writing container-related info. It sounds like each NM will need to have multiple timeline clients (one for each application).

More importantly, we need to think about the RM use case. The RM will have its own collector, and it does not go through the TimelineClient API. How would that work?

More individual comments:
- flowId should be flowName (that's the standard terminology we're using)
- flow version seems to be missing from this; while flow version is not part of the primary key of the entity, it is a necessary attribute
- I think flow run id can (and should) be a long; it doesn't have to be a generic string
- in light of this, it might be slightly better to have a (flow) context API, where you can set all these flow-related attributes, rather than individual arguments
- the default cluster id should be just the cluster name; I'm not sure why we need to add the cluster start timestamp; it would mean that every restart of the resource manager would create a new logical cluster in the timeline service; I'm not sure I agree with that
- hopefully isUnitTest can be removed with the changes I made in the previous commit
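As a rough illustration of the "(flow) context API rather than individual arguments" point above, the flow attributes could travel together in one object. The class below is hypothetical and is not the patch's TimelineCollectorContext; it only shows the shape, with the flow run id as a long.

```java
// Hypothetical context object; names are illustrative, not from the patch.
final class FlowContext {
  private String flowName;
  private String flowVersion;
  private long flowRunId;

  // Chained setters so all flow attributes can be set in one expression.
  FlowContext setFlowName(String flowName) {
    this.flowName = flowName;
    return this;
  }

  FlowContext setFlowVersion(String flowVersion) {
    this.flowVersion = flowVersion;
    return this;
  }

  FlowContext setFlowRunId(long flowRunId) {
    this.flowRunId = flowRunId;
    return this;
  }

  String getFlowName() {
    return flowName;
  }

  String getFlowVersion() {
    return flowVersion;
  }

  long getFlowRunId() {
    return flowRunId;
  }

  @Override
  public String toString() {
    return flowName + "/" + flowVersion + "/" + flowRunId;
  }

  public static void main(String[] args) {
    FlowContext context = new FlowContext()
        .setFlowName("foo.pig")
        .setFlowVersion("v.1")
        .setFlowRunId(2L);
    System.out.println(context); // prints foo.pig/v.1/2
  }
}
```

Adding a further attribute later then means adding a field here, without changing the signature of every method that passes the context along.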
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370432#comment-14370432 ]

Robert Kanter commented on YARN-3040:
-------------------------------------

Sorry I didn't reply earlier. I still haven't found the cycles to do patches for ATS work, so please go ahead and continue working on the updated patch. I'll be sure to review it.