[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2019-08-29 Thread Abhishek Modi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919222#comment-16919222
 ] 

Abhishek Modi commented on YARN-5281:
-

We already support HBase and CosmosDB as backends. We also support the 
local/HDFS filesystem with limited capabilities. Based on the discussion, I 
don't think it would be possible to support a simpler backend with full 
functionality without re-implementing some of the features provided by 
HBase/CosmosDB.

For a single-node setup, ATSv2 can still be used with limited functionality 
using the local filesystem as the backend.

If no one is actively working on this, I would close this as part of Jira 
cleanup for ATSv2.

cc [~vrushalic]/[~rohithsharma]

> Explore supporting a simpler back-end implementation for ATS v2
> ---
>
> Key: YARN-5281
> URL: https://issues.apache.org/jira/browse/YARN-5281
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Joep Rottinghuis
>Priority: Major
>  Labels: YARN-5355
>
> During the merge discussion [~kasha] raised the question of whether we would 
> support a simpler backend for users to try out, in addition to the HBase 
> implementation.
> The understanding is that this would not be meant to scale, but it could 
> simplify initial adoption and early usage.
> I'm filing this jira to gather the merits and challenges of such an approach in 
> one place.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2016-06-22 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344639#comment-15344639
 ] 

Sangjin Lee commented on YARN-5281:
---

If we're open to the limitation of a single-node setup, there could be more 
options (LevelDB, etc.).

In the future, we are definitely open to production-ready implementations that 
can be alternatives to the HBase implementation. We tried to make the storage 
as pluggable as possible, and probably we could do more to facilitate this down 
the road.

As you mentioned, we could also consider having configurations that turn off 
some features (aggregation or some types of queries) if they turn out to be too 
challenging for certain implementations. But that would need to come with the 
explicit acknowledgment that one would lose some functionality by doing so. 
For example, if flow run aggregation is turned off, users need to know what to 
expect.
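
To make the "explicit acknowledgment" idea concrete, here is a minimal sketch 
in Python. It is an illustration only: the class names, the feature switches, 
and the error type are all hypothetical and are not actual YARN classes or 
configuration keys. The point it shows is that a query depending on a disabled 
feature fails loudly instead of quietly returning partial data.

```python
from dataclasses import dataclass


class FeatureDisabledError(Exception):
    """Raised when a query needs a feature the admin has turned off."""


@dataclass(frozen=True)
class BackendFeatures:
    # Hypothetical switches; these are NOT real YARN configuration keys.
    flow_run_aggregation: bool = True
    metric_filters: bool = True


class TimelineReader:
    """Toy reader; flow_runs maps a flow run id to per-app metric values."""

    def __init__(self, features, flow_runs):
        self.features = features
        self.flow_runs = flow_runs

    def aggregated_metric(self, flow_run_id):
        if not self.features.flow_run_aggregation:
            # Fail explicitly so users know what they gave up, rather than
            # returning empty or partial results.
            raise FeatureDisabledError(
                "flow run aggregation is disabled for this backend")
        return sum(self.flow_runs[flow_run_id])
```

With the defaults, `TimelineReader(BackendFeatures(), {"run_1": [3, 4]})` 
answers the aggregation query; constructed with 
`BackendFeatures(flow_run_aggregation=False)`, the same call raises 
`FeatureDisabledError`.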




[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2016-06-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343694#comment-15343694
 ] 

Karthik Kambatla commented on YARN-5281:


Thanks for filing this, [~jrottinghuis]. Copying contents of my email here:

The reasons for my asking about alternate implementations: (1) ease of trying 
it out for YARN devs and iterating on bug fixes and improvements, and (2) ease of 
trying it out for app-writers/users to figure out whether they should use the ATS. 

A test implementation would be enough for #1, and would partially address #2. A 
more substantial implementation would be nice, but I guess we need to look at 
the ROI to decide whether adding that is a good idea. 

On completeness, I agree. Further, for some backend implementations, a 
particular aggregation/query might be feasible but too 
expensive to turn on. What are your thoughts on provisions for the admin to 
turn off some queries/aggregations? 




[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2016-06-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342636#comment-15342636
 ] 

Sangjin Lee commented on YARN-5281:
---

Thanks for opening the discussion [~jrottinghuis].

Others have already pointed out most of the key points. I would also very much 
like to see an alternate storage implementation, even if it is only for 
test/non-production purposes. That said, creating an implementation that has 
feature parity is a fairly major undertaking, as we found out with our local 
file-based implementation. It is not only creating an implementation that is 
non-trivial, but also *maintaining* it so that it keeps up with features as 
they are added. Just so that we're clear on the implications.




[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2016-06-21 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342439#comment-15342439
 ] 

Joep Rottinghuis commented on YARN-5281:


[~kasha] raised a good point: we should focus on eliminating as many 
entry barriers as possible. I understand the concern about having to support a 
separate HBase installation for small Hadoop clusters. It isn't ideal if people 
have to become HBase experts before using YARN. I can open a separate jira to 
ship a dead-simple configuration where all the processes needed for HBase can be 
launched on a single machine (perhaps by default wherever the user chooses to run 
the RM process). This would probably depend on YARN-5045.

As [~varun_saxena] pointed out, we did start by trying to maintain both HBase 
and file-based implementations. As the features progressed and became more 
sophisticated, it became increasingly difficult to maintain feature parity, 
which is why we ultimately decided to make the file-based implementation 
for testing only.

[~ozawa] highlighted the desire to have a simple implementation on HDFS. As I 
imagine what that would look like, especially to serve reads, filters, and 
queries, one would end up with what would essentially be a mini-HBase 
implementation.

We've separately discussed a potential solution to spool writes to disk 
if/when HBase is temporarily unavailable. We can see whether that can be used to 
serve some use cases for testing out the write API.
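
As a rough illustration of the spooling idea, here is a minimal sketch in 
Python. Everything in it is an assumption made for the example: the 
`SpoolingWriter` class, the `BackendUnavailable` exception, and the 
one-JSON-document-per-line spool file are hypothetical and do not reflect any 
existing ATS v2 code. It shows writes falling back to a local file while the 
backend is down and being replayed once it is reachable again.

```python
import json
import os


class BackendUnavailable(Exception):
    """Hypothetical signal that the primary store cannot be reached."""


class SpoolingWriter:
    def __init__(self, backend_write, spool_path):
        self.backend_write = backend_write  # callable that may raise
        self.spool_path = spool_path

    def write(self, entity):
        try:
            self.backend_write(entity)
        except BackendUnavailable:
            # Append one JSON document per line to the local spool file.
            with open(self.spool_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(entity) + "\n")

    def replay(self):
        """Re-send spooled entities in order; return how many got through."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as f:
            pending = [json.loads(line) for line in f if line.strip()]
        delivered = 0
        try:
            for entity in pending:
                self.backend_write(entity)
                delivered += 1
        except BackendUnavailable:
            pass  # stop at the first failure, keep the rest spooled
        # Rewrite the spool with only the entities still undelivered.
        with open(self.spool_path, "w", encoding="utf-8") as f:
            for entity in pending[delivered:]:
                f.write(json.dumps(entity) + "\n")
        return delivered
```

Note that this only covers the write path. Nothing here can serve reads, 
filters, or queries from the spool file, so it would help with trying out the 
write API and not with evaluating the reader side.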




[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2016-06-21 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342389#comment-15342389
 ] 

Joep Rottinghuis commented on YARN-5281:


Capturing [~ozawa]'s reply:
{noformat}
Thanks Sangjin for starting the discussion.

>> *First*, if the merge vote is approved, to which branch should this be
merged and what would be the release version?

As you mentioned, I think it's reasonable for us to target trunk and
3.0.0-alpha.

>> Slightly unrelated to the merge, do we plan to support any other simpler
backend for users to try out, in addition to HBase? LevelDB?
> We can however, potentially change the Local File System based
implementation to a HDFS based implementation and have it as an alternate
for non-production use,

At Apache Big Data 2016 NA, some users also mentioned that they need an HDFS
implementation. It's currently pending, but Varun and I worked on
supporting an HDFS backend (YARN-3874). As Karthik mentioned, it's useful for
early users to try the v2.0 APIs even though it doesn't scale. IMHO, it's useful
for small clusters (e.g. fewer than 10 machines). After merging the current
implementation into trunk, I'm interested in resuming the YARN-3874 work (maybe
Varun is also interested).

Regards,
- Tsuyoshi
{noformat}




[jira] [Commented] (YARN-5281) Explore supporting a simpler back-end implementation for ATS v2

2016-06-21 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342385#comment-15342385
 ] 

Joep Rottinghuis commented on YARN-5281:


Capturing [~varun_saxena]'s response on the email thread:
{noformat}
Thanks Karthik for sharing your views.

With regard to merging, it would help to have clear documentation on how to 
set up and use ATS.
--> We do have documentation on this. You and others who are interested can 
check out YARN-5174 which is the latest documentation related JIRA for ATSv2.

Slightly unrelated to the merge, do we plan to support any other simpler 
backend for users to try out, in addition to HBase? LevelDB?
--> We do have a File System based implementation but it is strictly for test 
purposes (as we write data into a local file). It also does not support all the 
features of Timeline Service v.2.
Regarding LevelDB, Timeline Service v.2 has distributed writers and LevelDB 
writes data (log files or SSTable files) to the local file system. This means there 
will be no easy way to have a LevelDB-based implementation because we would not 
know where to read the data from, especially while fetching flow-level 
information.
We can, however, potentially change the Local File System based implementation 
to an HDFS-based implementation and have it as an alternative for non-production 
use, if there is a potential need for it, based on community feedback. This, 
however, would have to be further discussed with the team.

Regards,
Varun Saxena.
{noformat}
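
The LevelDB concern in the quoted reply can be illustrated with a toy model in 
Python. This is a sketch under stated assumptions: `NodeLocalStore` is a 
hypothetical stand-in for an embedded, node-local store and does not use any 
real LevelDB API, and the flow and application names are made up. It shows 
that when each distributed writer keeps its data on its own node, a reader 
that reaches only one node gets a partial flow-level answer.

```python
class NodeLocalStore:
    """Hypothetical stand-in for an embedded store: one instance per node."""

    def __init__(self):
        self.rows = []

    def write(self, flow, app_id, value):
        self.rows.append((flow, app_id, value))

    def read_flow(self, flow):
        return [(a, v) for f, a, v in self.rows if f == flow]


def flow_total(stores, flow):
    """Sum a metric for one flow across the stores the reader can reach."""
    return sum(v for store in stores for _, v in store.read_flow(flow))


# Two writers on different nodes, each with its own local store.
node_a, node_b = NodeLocalStore(), NodeLocalStore()
node_a.write("etl", "app_1", 10)
node_b.write("etl", "app_2", 32)

# A reader that sees only one node's files gets a partial flow-level total.
partial = flow_total([node_a], "etl")
# A correct answer requires knowing about, and reaching, every node's store.
complete = flow_total([node_a, node_b], "etl")
```

A shared store such as HBase or HDFS avoids this because every writer and 
reader addresses the same storage, which is why the HDFS-based variant is 
raised as the more plausible non-production alternative.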
