[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802738#comment-17802738 ]

Shilun Fan commented on YARN-5814:
----------------------------------

Bulk update: moved all 3.4.0 non-blocker issues; please move back if it is a blocker. Retargeted to 3.5.0.

> Add druid as storage backend in YARN Timeline Service
> -----------------------------------------------------
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: ATSv2
> Affects Versions: 3.0.0-alpha2
> Reporter: Bingxue Qiu
> Priority: Major
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose adding Druid as a storage backend in YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in
> Alibaba clusters with thousands of nodes. We need to collect and store
> meta/events/metrics data, analyze utilization reports online across various
> dimensions, and display trends in resource allocation and usage for the
> cluster by joining and aggregating data. This helps us manage and optimize
> the cluster by tracking resource utilization.
> To achieve this, we switched from HBase to Druid as the storage and have had
> sub-second OLAP performance in our production environment for a few months.
> h3. Analysis
> Currently YARN Timeline Service only supports aggregating metrics a) at the
> flow level, by FlowRunCoprocessor, and b) at the application level, by
> AppLevelTimelineCollector. Offline (time-based periodic) aggregation for
> flows/users/queues for reporting and analysis is planned but not yet
> implemented. YARN Timeline Service uses Apache HBase as its primary storage
> backend, and HBase is not a good fit for OLAP.
> For arbitrary exploration of data, such as online analysis of utilization
> reports across various dimensions (Queue, Flow, Users, Application, CPU,
> Memory) by joining and aggregating data, Druid's custom column format enables
> ad-hoc queries without pre-computation. The format also enables fast scans on
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various dimensions,
> display of trends in resource allocation and usage for the cluster, and
> arbitrary exploration of data, we propose to add Druid storage and implement
> DruidWriter/DruidReader in YARN Timeline Service.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
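The issue description argues that a column-oriented layout allows ad-hoc aggregation over any dimension without pre-computation. The following is only a toy sketch of that idea, not Druid's storage format or API; the column names and values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical column-oriented layout: one list per column, aligned by row index.
# The column names and values are illustrative only.
columns = {
    "queue":     ["prod", "dev", "prod", "dev", "prod"],
    "user":      ["alice", "bob", "carol", "bob", "alice"],
    "memory_mb": [4096, 1024, 2048, 512, 8192],
    "vcores":    [4, 1, 2, 1, 8],
}


def aggregate(cols, dimension, metric):
    """Sum `metric` grouped by `dimension`, scanning only those two columns."""
    totals = defaultdict(int)
    for key, value in zip(cols[dimension], cols[metric]):
        totals[key] += value
    return dict(totals)


# Any (dimension, metric) pair can be chosen at query time.
by_queue = aggregate(columns, "queue", "memory_mb")
by_user = aggregate(columns, "user", "vcores")
print(by_queue)
print(by_user)
```

Each query touches only the two columns it needs, which is the property the description relies on for fast scans.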
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042759#comment-16042759 ]

daemon commented on YARN-5814:
------------------------------

[~BINGXUE QIU] Hi Bingxue, can you upload your patch? It would be very useful for me!
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15676080#comment-15676080 ]

Bingxue Qiu commented on YARN-5814:
-----------------------------------

Thanks [~sjlee0] for your suggestions!

On the Druid reader side, queries are based on Drill, so conditions such as filter lists can be supported with a self-join or left join. For example:
{code}
SELECT F.*
FROM druid.timeline_service_app F, druid.timeline_service_app S
WHERE F.appId = S.appId
  AND F.startTime > 1479440083000
  AND S.finishTime > 0
  AND F.appId = 'application_1476875405903_49989';
{code}
I am also grateful that you reminded me of the new issue. Druid supports ordering by a column, so would it make sense to add a column named "idPrefix"?
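To make the self-join query from the comment above easier to try out, here is a minimal sketch that runs the same SQL against an in-memory SQLite database. SQLite stands in for Drill and Druid purely for illustration; the three-column table layout and the sample rows are assumptions, since the real schema is defined in the attached design document.

```python
import sqlite3

# SQLite is only a stand-in here; the real setup queries Druid through Drill.
conn = sqlite3.connect(":memory:")
# Attach a second in-memory database named "druid" so the qualified
# table name from the comment can be used unchanged.
conn.execute("ATTACH DATABASE ':memory:' AS druid")
conn.execute(
    "CREATE TABLE druid.timeline_service_app "
    "(appId TEXT, startTime INTEGER, finishTime INTEGER)"  # assumed columns
)

# Made-up sample rows: one match, one unfinished app, one app that started too early.
rows = [
    ("application_1476875405903_49989", 1479440084000, 1479440999000),
    ("application_1476875405903_49990", 1479440085000, 0),
    ("application_1476875405903_49991", 1479000000000, 1479000500000),
]
conn.executemany("INSERT INTO druid.timeline_service_app VALUES (?, ?, ?)", rows)

query = """
SELECT F.*
FROM druid.timeline_service_app F, druid.timeline_service_app S
WHERE F.appId = S.appId
  AND F.startTime > 1479440083000
  AND S.finishTime > 0
  AND F.appId = 'application_1476875405903_49989'
"""
result = conn.execute(query).fetchall()
print(result)
```

The two aliases F and S let each filter condition be applied to its own copy of the table, which is how the comment proposes to express a filter list.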
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670054#comment-15670054 ]

Bingxue Qiu commented on YARN-5814:
-----------------------------------

Thanks [~gtCarrera9] for your suggestions:

1. On the writer design: we have implemented the writer using Kafka and an MR job (for HA) to pull data into Druid's realtime nodes, but I'm not sure this approach suits others as well. After all, Tranquility is simpler. I will share the design of both later; we can choose to implement one or both of them.

2. On the table design: in the Druid implementation, the timeline.entity table may not be a good fit for holding general timeline entities, including container data. In the HBase implementation, we can store general timeline entities with column families in the entity table and scan them by row key. Druid, however, is fixed-schema column storage. If we need ad-hoc queries and aggregation in real time, timeline.entity may become a wide table with many columns, which would introduce data redundancy, generate many rows, and increase cache misses. That is why we are considering adding these tables rather than timeline.entity.

Please feel free to give your suggestions. Thanks!
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668610#comment-15668610 ]

Sangjin Lee commented on YARN-5814:
-----------------------------------

Thanks [~BINGXUE QIU] for adding more details. It sounds like a good starting point. A couple of points.

It would be great to see some more details on the reader side of things. In the case of HBase/Phoenix, more sophisticated analytical queries are going to be based on the Phoenix (SQL) schema. In case of Druid, it would be interesting to see some examples being served right out of the same schema.

Also, with readers we do a fair amount of predicate (filter) pushdown. We have the set of timeline service filters which then get translated to HBase filters and are pushed down to the queries. Have you looked at what kind of predicate pushdown would be feasible in case of Druid?

I'd also like to bring your attention to the entity id prefix we just introduced (YARN-5715) to solve the problem of more natural sort/selection order. The lexicographical sort order based on the entity id's is going to fall short of the expectation, thus the reason for introducing the entity id prefix. You might want to take a look at that, and see how that would translate to the Druid schema.

I'll comment more if I have more things to think about.
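The point about lexicographical ordering of entity ids can be shown with a small sketch. The entity ids and the numeric `idPrefix` values below are hypothetical and only illustrate why a separate sort key helps; they do not reproduce the exact semantics introduced in YARN-5715.

```python
# Hypothetical entities: the ids and the idPrefix values are made up for illustration.
entities = [
    {"id": "container_10", "idPrefix": 10},
    {"id": "container_2", "idPrefix": 2},
    {"id": "container_1", "idPrefix": 1},
]

# Plain string sort on the id: "container_10" lands before "container_2".
by_id = sorted(e["id"] for e in entities)

# Sorting on (idPrefix, id) restores the order a reader would expect.
by_prefix = [
    e["id"] for e in sorted(entities, key=lambda e: (e["idPrefix"], e["id"]))
]

print(by_id)
print(by_prefix)
```

In a column store, the same effect would presumably come from ordering on a dedicated prefix column, which is the direction the later "idPrefix" suggestion in this thread takes.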
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665393#comment-15665393 ]

Li Lu commented on YARN-5814:
-----------------------------

Thanks [~BINGXUE QIU] for the doc! I have some quick questions:

1. According to the Design section, the writer may require Tranquility and/or Kafka as intermediate layers. I'm wondering if there are any issues with these dependencies?

2. For the table design, right now in timeline v.2, container is not a top-level concept (although it is a top-level concept for YARN). Therefore, I'm wondering whether it would be helpful to generalize the container table into an entity table, as in the HBase implementation. We may still put container-level data into this table, but maybe it's possible not to limit this table to containers only?
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665337#comment-15665337 ]

Li Lu commented on YARN-5814:
-----------------------------

Linking this issue to the umbrella JIRA of timeline v.2.
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656598#comment-15656598 ]

Bingxue Qiu commented on YARN-5814:
-----------------------------------

I have uploaded the design. It contains our ideas about druid writer, reader and schema. Please feel free to give your suggestions. Thanks
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635734#comment-15635734 ]

Bingxue Qiu commented on YARN-5814:
-----------------------------------

Thanks [~djp], [~sjlee0] for your support! I will give a more concrete design next week
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635730#comment-15635730 ]

Bingxue Qiu commented on YARN-5814:
-----------------------------------

Thanks [~djp], [~sjlee0] for your support! I will give a more concrete design next week
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633581#comment-15633581 ]

Junping Du commented on YARN-5814:
----------------------------------

Thanks [~sjlee0] for the positive feedback. The current ATS v2 weekly call doesn't work for him in Beijing (midnight). I can help coordinate some calls for the initial efforts and design, as I know people on both sides (I saw their demo during my recent visit to Beijing).
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633432#comment-15633432 ] Sangjin Lee commented on YARN-5814: --- Thanks [~BINGXUE QIU] for your proposal! I am certainly +1 on it. It would give people more choice in what they use as their storage backend. The HBase backend was chosen primarily for its scalability, and the basic data is more about OLTP than OLAP. The OLAP side of things would be implemented via Phoenix offline aggregation. We tried our best to be pluggable when it comes to swapping storage backends, but no doubt there are places where the HBase details leak. We would need to address them together. As Junping mentioned, I also look forward to a more detailed proposal. The group that's working closely on the timeline service v2 has weekly status calls, but the timing might not work out well for you. Please do reach out to me via email to see how we can work more closely.
[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service
[ https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633315#comment-15633315 ] Junping Du commented on YARN-5814: -- Thanks [~BINGXUE QIU] for reporting this issue. I think this use case and implementation from Alibaba could benefit our community for several reasons: 1. It will showcase that our ATS v2 design and implementation are flexible enough to support different storage backends for different use cases. The backend can be NoSQL (HBase), a filesystem (HDFS; folks from NTT seem to be working on it) and, of course, some OLAP implementation. 2. Our current HBase backend implementation lacks ad-hoc query support for timeline info. Under our earlier assumption that timeline info would be accessed only in limited ways, such as getting runtime or offline aggregation info from the UI, this is not a problem. However, if we would like to support interactive queries for timeline info on a large and busy cluster, HBase may not be the best fit. I believe there could be YARN users other than Alibaba with similar requirements if we consider analysis of YARN application info to be a real big data problem, and the proposed effort can expand our ATS v2 scenarios. I think we should consider merging this proposal into our ongoing ATS v2 effort (maybe under YARN-5355?) if [~BINGXUE QIU] can work out a more concrete design. ATS v2 folks ([~sjlee0], [~vinodkv], [~gtCarrera9], [~vrushalic], [~jrottinghuis], [~varun_saxena], [~Naganarasimha] and [~rohithsharma]), what do you think?