[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919222#comment-16919222 ]

Abhishek Modi commented on YARN-5281:
-------------------------------------

We already support HBase and CosmosDB as backends. We also support the local/HDFS filesystem with limited capabilities. Based on the discussion, I don't think it would be possible to support a simpler backend with all functionality without re-implementing some of the features provided by HBase/CosmosDB. For a single-node setup, ATSv2 can still be used with limited functionality using the local filesystem as the backend.

If no one is actively working on this, I will close it as part of the Jira cleanup for ATSv2.

cc [~vrushalic]/[~rohithsharma]

> Explore supporting a simpler back-end implementation for ATS v2
> ---------------------------------------------------------------
>
>                 Key: YARN-5281
>                 URL: https://issues.apache.org/jira/browse/YARN-5281
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Joep Rottinghuis
>            Priority: Major
>              Labels: YARN-5355
>
> During the merge discussion [~kasha] raised the question whether we would
> support simpler backend for users to try out, in addition to the HBase
> implementation.
> The understanding is that this would not be meant to scale, but it could
> simplify initial adoption and early usage.
> I'm filing this jira to gather the merits and challenges of such approach in
> one place.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344639#comment-15344639 ]

Sangjin Lee commented on YARN-5281:
-----------------------------------

If we're open to the limitation of a single-node setup, there could be more options (LevelDB, etc.). In the future, we are definitely open to production-ready implementations that can be alternatives to the HBase implementation. We tried to make the storage as pluggable as possible, and we could probably do more to facilitate this down the road.

As you mentioned, we could also consider having configurations that turn off some features (aggregation or certain types of queries) if they turn out to be too challenging for certain implementations. But that would need to come with the explicit acknowledgment that one would lose some functionality by doing so. For example, if flow run aggregation is turned off, users need to know what to expect.
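The idea of switching off features per backend, with an explicit acknowledgment of what is lost, can be sketched roughly as follows. This is a toy illustration only, not ATSv2 code: `Feature`, `TimelineStore`, `FeatureDisabledError` and the method names are all hypothetical, and the store is a plain in-memory dictionary.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Feature(Enum):
    """Optional capabilities a backend may provide (hypothetical names)."""
    FLOW_RUN_AGGREGATION = auto()
    FILTERED_QUERIES = auto()


class FeatureDisabledError(RuntimeError):
    """Raised when a query needs a feature that has been turned off."""


@dataclass
class TimelineStore:
    """Toy in-memory store standing in for any pluggable backend."""
    # By default every feature is enabled; an admin could pass a smaller set.
    enabled: frozenset = frozenset(Feature)
    # (flow_run_id, metric) -> list of per-application values
    metrics: dict = field(default_factory=dict)

    def write_metric(self, flow_run_id, metric, value):
        self.metrics.setdefault((flow_run_id, metric), []).append(value)

    def flow_run_total(self, flow_run_id, metric):
        # Fail loudly rather than silently returning partial data, so users
        # know exactly which functionality they gave up.
        if Feature.FLOW_RUN_AGGREGATION not in self.enabled:
            raise FeatureDisabledError(
                "flow run aggregation is turned off for this backend")
        return sum(self.metrics.get((flow_run_id, metric), []))
```

A store created as `TimelineStore(enabled=frozenset({Feature.FILTERED_QUERIES}))` still accepts writes but rejects `flow_run_total` with a clear error, which is one way to make the lost functionality explicit.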
[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343694#comment-15343694 ]

Karthik Kambatla commented on YARN-5281:
----------------------------------------

Thanks for filing this, [~jrottinghuis]. Copying contents of my email here:

The reasons for my asking about alternate implementations: (1) ease of trying it out for YARN devs and of iterating on bug fixes and improvements, and (2) ease of trying it out for app writers/users to figure out whether they should use the ATS. A test implementation would be enough for #1, and would partially address #2. A more substantial implementation would be nice, but I guess we need to look at the ROI to decide whether adding that is a good idea.

On completeness, I agree. Further, for some backend implementations, a particular aggregation/query might be possible but too expensive to turn on. What are your thoughts on provisions for the admin to turn off some queries/aggregations?
[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342636#comment-15342636 ]

Sangjin Lee commented on YARN-5281:
-----------------------------------

Thanks for opening the discussion [~jrottinghuis]. Others have already pointed out most of the key points. I would also very much like to see an alternate storage implementation, even if it is only for test/non-production purposes.

That said, creating an implementation that has feature parity is a fairly major undertaking, as we found out with our local file-based implementation. It is not only creating an implementation that is non-trivial, but also *maintaining* it so that it keeps up with features that are being added. Just so that we're clear on the implications.
[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342439#comment-15342439 ]

Joep Rottinghuis commented on YARN-5281:
----------------------------------------

[~kasha] raised a good point: we should focus on eliminating as many entry barriers as possible. I understand the concern about having to support a separate HBase installation for small Hadoop clusters. It isn't ideal if people have to become HBase experts before using YARN.
I can open a separate jira to ship a dead-simple configuration where all needed processes for HBase can be launched on a single machine (perhaps by default where the user chooses to run the RM process). This would probably depend on YARN-5045.

As [~varun_saxena] pointed out, we did start out trying to maintain both the HBase and the file-based implementation. As the features progressed and became more sophisticated, it became increasingly difficult to maintain feature parity, which is why we ultimately decided to make the file-based implementation testing-only.

[~ozawa] highlighted the desire to have a simple implementation on HDFS. As I imagine what that would look like, especially to serve reads, filters and queries, one would end up with what would essentially be a mini-HBase implementation.

We've separately discussed a potential solution to spool writes to disk if/when HBase is temporarily unavailable. We can see if that can be used to serve some use cases to test out the API for writes.
[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342389#comment-15342389 ]

Joep Rottinghuis commented on YARN-5281:
----------------------------------------

Capturing [~ozawa]'s reply:
{noformat}
Thanks Sangjin for starting the discussion.

>> *First*, if the merge vote is approved, to which branch should this be merged and what would be the release version?

As you mentioned, I think it's reasonable for us to target trunk and 3.0.0-alpha.

>> Slightly unrelated to the merge, do we plan to support any other simpler backend for users to try out, in addition to HBase? LevelDB?
> We can however, potentially change the Local File System based implementation to a HDFS based implementation and have it as an alternate for non-production use,

At Apache Big Data 2016 NA, some users also mentioned that they need an HDFS implementation. It is currently pending, but Varun and I tried to add support for an HDFS backend (YARN-3874). As Karthik mentioned, it's useful for early users to try the v2.0 APIs even though it doesn't scale. IMHO, it's useful for small clusters (e.g. smaller than 10 machines). After merging the current implementation into trunk, I'm interested in resuming the YARN-3874 work (maybe Varun is interested as well).

Regards,
- Tsuyoshi
{noformat}
[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2
[ https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342385#comment-15342385 ]

Joep Rottinghuis commented on YARN-5281:
----------------------------------------

Capturing [~varun_saxena]'s response on the email thread:
{noformat}
Thanks Karthik for sharing your views.

With regards to merging, it would help to have clear documentation on how to set up and use ATS.
--> We do have documentation on this. You and others who are interested can check out YARN-5174, which is the latest documentation-related JIRA for ATSv2.

Slightly unrelated to the merge, do we plan to support any other simpler backend for users to try out, in addition to HBase? LevelDB?
--> We do have a File System based implementation, but it is strictly for test purposes (as we write data into a local file). It also does not support all the features of Timeline Service v.2.

Regarding LevelDB, Timeline Service v.2 has distributed writers, and LevelDB writes data (log files or SSTable files) to the local file system. This means there will be no easy way to have a LevelDB based implementation because we would not know where to read the data from, especially while fetching flow level information.

We can however, potentially change the Local File System based implementation to a HDFS based implementation and have it as an alternate for non-production use, if there is a potential need for it, based on community feedback. This however, would have to be further discussed with the team.

Regards,
Varun Saxena.
{noformat}
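The LevelDB concern above (distributed writers, each persisting to its own local file system) can be illustrated with a small toy model. It does not use LevelDB or any ATSv2 API; `NodeLocalStore`, `flow_total`, and the node, flow and application names are made up for the sketch.

```python
from collections import defaultdict


class NodeLocalStore:
    """Stands in for an embedded store whose files live on a single node."""

    def __init__(self, node):
        self.node = node
        self.rows = defaultdict(list)  # flow name -> list of (app_id, value)

    def put(self, flow, app_id, value):
        self.rows[flow].append((app_id, value))


def flow_total(visible_stores, flow):
    """Sum a flow's values across only the stores the reader can reach."""
    return sum(value
               for store in visible_stores
               for _, value in store.rows.get(flow, []))


# Two writers on different nodes record applications of the same flow.
node_a = NodeLocalStore("node-a")
node_b = NodeLocalStore("node-b")
node_a.put("etl-flow", "app_1", 10)
node_b.put("etl-flow", "app_2", 32)

# A reader that only sees node-a's files gets an incomplete flow-level answer.
partial = flow_total([node_a], "etl-flow")
# A correct answer requires locating and reading every node's local store.
complete = flow_total([node_a, node_b], "etl-flow")
```

In this model the flow-level result is only correct when the reader can enumerate every node that ever wrote data for the flow, which is the coordination that a shared store such as HBase or HDFS provides.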