Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Karthik Kambatla Tue, 21 Jun 2016 21:29:45 -0700

For the catch-up, I meant during Hadoop Summit next week.

On Tue, Jun 21, 2016 at 10:28 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:


> The reasons for my asking about alternate implementations: (1) ease of
> trying it out for YARN devs and iterating on bug fixes and improvements, and
> (2) ease of trying it out for app-writers/users to figure out whether they
> should use the ATS. Again, personally, I don't see this as necessary for the
> merge itself, but more so for adoption.
>
> A test implementation would be enough for #1, and would partially address
> #2. A more substantial implementation would be nice, but I guess we need to
> look at the ROI to decide whether adding that is a good idea.
>
> On completeness, I agree. Further, for some backend implementations, a
> particular aggregation/query might be possible but too expensive to turn on.
> What are your thoughts on provisions for the admin to turn off some
> queries/aggregations?
>
> Orthogonal: is there interest here in catching up on ATS specifically on one
> of the days? Maybe during the breaks or after the sessions?
>
> On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <l...@hortonworks.com> wrote:
>
>> HDFS or other non-HBase implementations are very helpful. We didn’t focus
>> on those implementations in the first milestone because we would like to
>> have one working version as a starting point. We can certainly add more
>> implementations when the feature gets more mature.
>>
>> That said, one of my concerns when building these storage implementations
>> is “completeness”. We have added a lot of support for data aggregation. As
>> of today, part of the aggregation (flow run aggregation) may be performed
>> by HBase coprocessors. When implementing comparable storage impls, it is
>> worth noting that one may want to provide equivalent mechanisms to perform
>> those aggregations (to really make an implementation “complete enough”
>> for, or “interchangeable” with, the existing HBase impl).
>>
>> Li Lu
>> > On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
>> >
>> > Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd like
>> > to get a better sense of what you're thinking of. Are you interested in
>> > strictly a test implementation (e.g. perfectly fine in a single node setup)
>> > or a more substantial implementation (may not scale but needs to work in a
>> > more realistic setup)?
>> >
>> > Regards,
>> > Sangjin
>> >
>> > On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis <jrottingh...@gmail.com>
>> > wrote:
>> >
>> >> Thanks Karthik and Tsuyoshi for bringing up good points.
>> >>
>> >> I've opened https://issues.apache.org/jira/browse/YARN-5281 to track this
>> >> discussion and capture all the merits and challenges in one single place.
>> >>
>> >> Thanks,
>> >>
>> >> Joep
>> >>
>> >> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
>> >>
>> >>> Thanks Sangjin for starting the discussion.
>> >>>
>> >>>>> *First*, if the merge vote is approved, to which branch should this be
>> >>>>> merged and what would be the release version?
>> >>>
>> >>> As you mentioned, I think it's reasonable for us to target trunk and
>> >>> 3.0.0-alpha.
>> >>>
>> >>>>> Slightly unrelated to the merge, do we plan to support any other simpler
>> >>>>> backend for users to try out, in addition to HBase? LevelDB?
>> >>>> We can, however, potentially change the Local File System based
>> >>>> implementation to an HDFS-based implementation and have it as an alternate
>> >>>> for non-production use,
>> >>>
>> >>> At Apache Big Data 2016 NA, some users also mentioned that they need an
>> >>> HDFS implementation. It's currently pending, but Varun and I tried to work
>> >>> on supporting an HDFS backend (YARN-3874). As Karthik mentioned, it's useful
>> >>> for early users to try the v2.0 APIs even though it doesn't scale. IMHO, it's
>> >>> useful for small clusters (e.g. smaller than 10 machines). After merging the
>> >>> current implementation into trunk, I'm interested in resuming the YARN-3874
>> >>> work (maybe Varun is also interested).
>> >>>
>> >>> Regards,
>> >>> - Tsuyoshi
>> >>>
>> >>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <varun.sax...@huawei.com>
>> >>> wrote:
>> >>>> Thanks Karthik for sharing your views.
>> >>>>
>> >>>> With regards to merging, it would help to have clear documentation on how
>> >>>> to set up and use ATS.
>> >>>> --> We do have documentation on this. You and others who are interested
>> >>>> can check out YARN-5174, which is the latest documentation-related JIRA for
>> >>>> ATSv2.
>> >>>>
>> >>>> Slightly unrelated to the merge, do we plan to support any other simpler
>> >>>> backend for users to try out, in addition to HBase? LevelDB?
>> >>>> --> We do have a File System based implementation, but it is strictly for
>> >>>> test purposes (as we write data into a local file). It also does not
>> >>>> support all the features of Timeline Service v.2.
>> >>>> Regarding LevelDB, Timeline Service v.2 has distributed writers, and
>> >>>> LevelDB writes data (log files or SSTable files) to the local file system.
>> >>>> This means there will be no easy way to have a LevelDB-based implementation,
>> >>>> because we would not know where to read the data from, especially while
>> >>>> fetching flow-level information.
>> >>>> We can, however, potentially change the Local File System based
>> >>>> implementation to an HDFS-based implementation and have it as an alternate
>> >>>> for non-production use, if there is a potential need for it, based on
>> >>>> community feedback. This, however, would have to be further discussed with
>> >>>> the team.
>> >>>>
>> >>>> Regards,
>> >>>> Varun Saxena.
>> >>>>
>> >>>> -----Original Message-----
>> >>>> From: Karthik Kambatla [mailto:ka...@cloudera.com]
>> >>>> Sent: 21 June 2016 10:29
>> >>>> To: Sangjin Lee
>> >>>> Cc: yarn-dev@hadoop.apache.org
>> >>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk
>> >>>>
>> >>>> Firstly, thanks Sangjin and others for driving this major feature.
>> >>>>
>> >>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as it
>> >>>> will give early access to downstream users.
>> >>>>
>> >>>> With regards to merging, it would help to have clear documentation on how
>> >>>> to set up and use ATS.
>> >>>>
>> >>>> Slightly unrelated to the merge, do we plan to support any other simpler
>> >>>> backend for users to try out, in addition to HBase? LevelDB? I understand
>> >>>> this wouldn't scale, but would it help with initial adoption and feedback
>> >>>> from early users?
>> >>>>
>> >>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> I’d like to open a discussion on merging the Timeline Service v.2
>> >>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
>> >>>>> developing the feature in a feature branch (YARN-2928 [3]) for a
>> >>>>> while, and we are reasonably confident that the state of the feature
>> >>>>> meets the criteria to be merged onto trunk and we'd love folks to get
>> >>>>> their hands on it and provide valuable feedback so that we can make it
>> >>>>> production-ready.
>> >>>>>
>> >>>>> In a nutshell, Timeline Service v.2 delivers significant scalability
>> >>>>> and usability improvements based on a new architecture. You can browse
>> >>>>> the requirements/design doc, the storage schema doc, the new
>> >>>>> entity/data model, the YARN documentation, and also discussions on
>> >>>>> subsequent milestones on YARN-2928 [1].
>> >>>>>
>> >>>>> What we would like to merge to trunk is termed "alpha 1" (milestone
>> >>>>> 1). The feature has a complete end-to-end read/write flow, and you
>> >>>>> should be able to start setting it up and testing it. At a high level,
>> >>>>> the following are the key features that have been implemented:
>> >>>>>
>> >>>>> - distributed writers (collectors) as NM aux services
>> >>>>> - HBase storage
>> >>>>> - new entity model that includes flows
>> >>>>> - setting the flow context via YARN app tags
>> >>>>> - real time metrics aggregation to the application level and the flow
>> >>>>> level
>> >>>>> - rich REST API that supports filters, complex conditionals, limits,
>> >>>>> content selection, etc.
>> >>>>> - YARN generic events and system metrics
>> >>>>> - integration with Distributed Shell and MapReduce
>> >>>>>
>> >>>>> There are a total of 139 subtasks that were completed as part of this
>> >>>>> effort.
>> >>>>>
>> >>>>> We paid close attention to ensuring that Timeline Service v.2 does not
>> >>>>> impact existing functionality when it is disabled (which is the default).
>> >>>>>
>> >>>>> I'd like to call out a couple of things to discuss in particular.
>> >>>>>
>> >>>>> *First*, if the merge vote is approved, to which branch should this be
>> >>>>> merged and what would be the release version? My preference is that
>> >>>>> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1* if
>> >>>>> approved.
>> >>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
>> >>>>> thoughts on this.
>> >>>>>
>> >>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase from YARN.
>> >>>>> It is not a cyclical dependency (as HBase does not really depend on YARN).
>> >>>>> However, the version of Hadoop that HBase currently supports lags
>> >>>>> behind the Hadoop version that Timeline Service is based on, so there
>> >>>>> is a potential for subtle dependency conflicts. We made some efforts
>> >>>>> to isolate the issue (see [4] and [5]). The HBase folks have also been
>> >>>>> responsive in keeping up with the trunk as much as they can.
>> >>>>> Nonetheless, this is something to keep in mind.
>> >>>>>
>> >>>>> I would love to get your thoughts on these and more before we open a
>> >>>>> real voting thread. Thanks!
>> >>>>>
>> >>>>> Regards,
>> >>>>> Sangjin
>> >>>>>
>> >>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
>> >>>>> [2] MAPREDUCE-6331:
>> >>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
>> >>>>> [3] YARN-2928 commits:
>> >>>>> https://github.com/apache/hadoop/commits/YARN-2928
>> >>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
>> >>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>>
>>
>
