[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261187#comment-15261187
 ] 

Hitesh Shah commented on YARN-4844:
---

bq. Per my understanding, changing from int to long won't affect downstream 
projects much; it is an error that the compiler catches directly. And 
getMemory/getVCores should not be used intensively by downstream projects. For 
example, MR has only ~20 uses of getMemory()/getVCores() in non-test code, 
which can be easily fixed.

If you are going to force downstream apps to change, I don't understand why you 
are not forcing them to do this in the first 3.0.0 release. What benefit does 
delaying it to some later 3.x.y release provide anyone? It just means that you 
have to do the production-stability verification of upstream apps all over 
again. 

> Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64
> --
>
> Key: YARN-4844
> URL: https://issues.apache.org/jira/browse/YARN-4844
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Priority: Blocker
> Attachments: YARN-4844.1.patch, YARN-4844.2.patch, YARN-4844.3.patch
>
>
> We use int32 for memory now. If a cluster has 10k nodes and each node has 210G 
> of memory, we will get a negative total cluster memory.
> Another case that overflows int32 even more easily: we add the pending 
> resources of all running apps into the cluster's total pending resources. If a 
> problematic app requests too many resources (say 1M+ containers, each 
> requesting 3G of memory), int32 is not enough.
> Even if we cap each app's pending request, we cannot handle the case where 
> there are many running apps, each with capped but still significant pending 
> resources.
> So we may need to upgrade the int32 memory field (and possibly v-cores as 
> well) to int64 to avoid integer overflow. 
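
To make the first overflow concrete, a minimal sketch of the arithmetic (the node count and per-node memory below are just the figures from the description, expressed in MB since YARN tracks memory in MB):

{code:java}
public class ClusterMemoryOverflow {
  public static void main(String[] args) {
    int nodes = 10_000;
    int memoryPerNodeMb = 210 * 1024;            // 210G per node, in MB

    // int * int stays int: 2,150,400,000 > Integer.MAX_VALUE, so it wraps around.
    int totalInt32 = nodes * memoryPerNodeMb;
    long totalInt64 = (long) nodes * memoryPerNodeMb;

    System.out.println("int32 total cluster memory: " + totalInt32);  // prints a negative number
    System.out.println("int64 total cluster memory: " + totalInt64);  // prints 2150400000
  }
}
{code}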





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257158#comment-15257158
 ] 

Wangda Tan commented on YARN-4844:
--

bq. ...Given the debate about the extent of the changes we want to make, can we 
put up a patch that changes the int32 to int64, adds getMemoryLong with a 
Private annotation (so that we can make changes later if we wish), and only 
fixes the pending memory check that was added in 2.8?...
I agree the size of the patch looks scary :-p, but if you look into it, the 
changes are all very simple fixes, so I don't think it will cause a lot of 
issues. You may feel better once I have fixed all the Jenkins issues.
I have considered fixing only the pending resource calculation, but that looks 
hard to me: the pending resource calculation uses 
ResourceCalculator/ResourceUsage, and ResourceCalculator and the related static 
methods of Resources are used everywhere in the RM.
Marking get___Long as @Private sounds good to me. Pending resources haven't 
been exposed to applications via the Java API yet; they are only exposed in the 
REST API, which the patch already fixes.

Thoughts?



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257142#comment-15257142
 ] 

Wangda Tan commented on YARN-4844:
--

bq. So the plan is to force users to change their usage of these APIs in some 
version of 3.x but not in 3.0.0?
Regardless of the debates about the first release of 3.x, let's assume it 
happens soon.
The plan in my mind is to make sure incompatible API changes get in by the time 
3.x enters beta releases. We have a couple of other API changes on the way in 
YARN as well, for example ATSv2, the new web UI, etc.

bq. Additionally, we are not talking about use in production but rather about 
making upstream apps change as needed to work with 3.x and, over time, 
stabilize 3.x.
Per my understanding, changing from int to long won't affect downstream 
projects much; it is an error that the compiler catches directly. And 
getMemory/getVCores should not be used intensively by downstream projects. For 
example, MR has only ~20 uses of getMemory()/getVCores() in non-test code, 
which can be easily fixed.
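
To illustrate what such a downstream fix would look like, a sketch of a hypothetical call site (not actual MR code), assuming the trunk variant being debated here where getMemory() itself is changed to return long:

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;

public class DownstreamCallSite {
  static void logAllocation(Resource resource) {
    // Before the change this was: int memoryMb = resource.getMemory();
    // Once getMemory() returns long, that line fails to compile with
    // "incompatible types: possible lossy conversion from long to int",
    // so the mechanical one-line fix is to widen the local variable:
    long memoryMb = resource.getMemory();
    int vcores = resource.getVirtualCores();
    System.out.println("allocated memory=" + memoryMb + "MB, vcores=" + vcores);
  }
}
{code}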

bq. Making an API change earlier rather than later is actually better, as the 
API changes in this case have no relevance to production stability.
I agree that it won't affect production stability. However, it adds extra 
overhead to development work, which I'd like to avoid.



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-25 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256606#comment-15256606
 ] 

Varun Vasudev commented on YARN-4844:
-

I'm a little concerned about the size of the changes required, coming in so 
close to the 2.8 release. Given the debate about the extent of the changes we 
want to make, can we put up a patch that changes the int32 to int64, adds 
getMemoryLong with a Private annotation (so that we can make changes later if 
we wish), and only fixes the pending memory check that was added in 2.8? We can 
do a follow-up on how to fix this in the wider community. What do you think 
[~leftnoteasy]?



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254970#comment-15254970
 ] 

Hitesh Shah commented on YARN-4844:
---

Additionally, we are not talking about use in production but rather about 
making upstream apps change as needed to work with 3.x and, over time, 
stabilize 3.x. Making an API change earlier rather than later is actually 
better, as the API changes in this case have no relevance to production 
stability. 



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254967#comment-15254967
 ] 

Hitesh Shah commented on YARN-4844:
---

bq. considering there are hundreds of blockers and criticals for the 3.0.0 
release, nobody will actually use the new release in production even if a 
3.0-alpha can be released. We can mark the Resource API on trunk as unstable 
and update it in future 3.x releases.

So the plan is to force users to change their usage of these APIs in some 
version of 3.x but not in 3.0.0? 



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254944#comment-15254944
 ] 

Hadoop QA commented on YARN-4844:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 53 new or modified test files. |
| 0 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 43s | trunk passed |
| +1 | compile | 1m 58s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 2m 8s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 50s | trunk passed |
| +1 | mvnsite | 3m 5s | trunk passed |
| +1 | mvneclipse | 1m 30s | trunk passed |
| +1 | findbugs | 5m 46s | trunk passed |
| +1 | javadoc | 2m 22s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 5m 3s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 34s | the patch passed |
| +1 | compile | 1m 46s | the patch passed with JDK v1.8.0_77 |
| +1 | cc | 1m 46s | the patch passed |
| +1 | javac | 1m 46s | the patch passed |
| +1 | compile | 2m 7s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 2m 7s | the patch passed |
| +1 | javac | 2m 7s | the patch passed |
| -1 | checkstyle | 0m 50s | hadoop-yarn-project/hadoop-yarn: patch generated 61 new + 1404 unchanged - 47 fixed = 1465 total (was 1451) |
| +1 | mvnsite | 2m 50s | the patch passed |
| +1 | mvneclipse | 1m 19s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) with tabs. |
| -1 | findbugs | 1m 15s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 2m 10s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 4m 47s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 19s | hadoop-yarn-api in the patch passed with JDK v1.8.0_77. |
| -1 | unit | 1m 55s | hadoop-yarn-common in the patch failed with JDK v1.8.0_77. |
| +1 | unit | 0m 19s | hadoop-yarn-server-common in the patch passed with JDK v1.8.0_77. |
| -1 |

[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254897#comment-15254897
 ] 

Wangda Tan commented on YARN-4844:
--

[~hitesh], considering there are hundreds of blockers and criticals for the 
3.0.0 release, nobody will actually use the new release in production even if a 
3.0-alpha can be released. We can mark the Resource API on trunk as unstable 
and update it in future 3.x releases.



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253472#comment-15253472
 ] 

Hadoop QA commented on YARN-4844:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 53 new or modified test files. |
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 47s | trunk passed |
| +1 | compile | 2m 4s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 2m 10s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 44s | trunk passed |
| +1 | mvnsite | 2m 54s | trunk passed |
| +1 | mvneclipse | 1m 28s | trunk passed |
| +1 | findbugs | 5m 43s | trunk passed |
| +1 | javadoc | 2m 23s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 5m 1s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 32s | the patch passed |
| +1 | compile | 1m 43s | the patch passed with JDK v1.8.0_77 |
| -1 | cc | 4m 25s | hadoop-yarn-project_hadoop-yarn-jdk1.8.0_77 with JDK v1.8.0_77 generated 1 new + 2 unchanged - 1 fixed = 3 total (was 3) |
| +1 | cc | 1m 43s | the patch passed |
| +1 | javac | 1m 43s | the patch passed |
| +1 | compile | 2m 1s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 2m 1s | the patch passed |
| +1 | javac | 2m 1s | the patch passed |
| -1 | checkstyle | 0m 48s | hadoop-yarn-project/hadoop-yarn: patch generated 61 new + 1408 unchanged - 47 fixed = 1469 total (was 1455) |
| +1 | mvnsite | 2m 46s | the patch passed |
| +1 | mvneclipse | 1m 16s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) with tabs. |
| -1 | findbugs | 1m 15s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 2m 13s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 4m 48s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 20s | hadoop-yarn-api in the patch passed with JDK v1.8.0_77. |
| -1 | unit | 1m 54s | hadoop-yarn-common in the patch failed with JDK

[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253084#comment-15253084
 ] 

Hitesh Shah commented on YARN-4844:
---

bq. It is not a very hard thing to drop it; we'd better do it close to the 
first branch-3 release.

I believe a recent comment on the mailing list was trying to target a 3.0 
release within the next few weeks, so I guess that means we make this change 
now? 



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253048#comment-15253048
 ] 

Wangda Tan commented on YARN-4844:
--

[~hitesh],

Agree it is very messy, but it seems there's no other way to do it. :(
I also tried adding a new Resource object (like YarnServerResource) that would 
be used only internally by YARN, keeping the user-facing API clean. But there 
are too many interactions between applications and services that use 
api.Resource, and we would need to handle all of those cases separately, so it 
doesn't look like a doable plan to me.

For the branch-3 release, I prefer to keep {{long getMemory/etc}} only. 
However, since getMemory is used 1000+ times inside the ResourceManager 
project, if we drop {{Resource#int getMemory}} from trunk now, we would need to 
write two versions of almost every RM patch, which would be a HUGE headache for 
YARN RM contributors. It is not a very hard thing to drop it; we'd better do it 
close to the first branch-3 release.



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253028#comment-15253028
 ] 

Hitesh Shah commented on YARN-4844:
---

getMemoryLong(), etc. just seems messy. I can understand why this is needed on 
branch-2 if we need to support long, but for trunk it seems better to change 
getMemory() to return a long. 



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253005#comment-15253005
 ] 

Hadoop QA commented on YARN-4844:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | YARN-4844 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800125/YARN-4844.1.patch |
| JIRA Issue | YARN-4844 |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/11170/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253004#comment-15253004
 ] 

Wangda Tan commented on YARN-4844:
--

An additional note about why this is a 2.8 blocker:

Currently the Capacity Scheduler relies on the total pending resource: when 
trying to assign containers on each node heartbeat, starting from the root 
queue, it skips any queue that has <= 0 pending resources.

So if the pending memory resource overflows, no more containers can be 
allocated.

branch-2.7 will not be affected since the new logic is only in branch-2.8.
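
A rough sketch of that failure mode (a simplified stand-in for the pending-resource guard, not the actual Capacity Scheduler code; the numbers are the 1M-containers-at-3G case from the description):

{code:java}
public class PendingResourceOverflow {
  public static void main(String[] args) {
    int containers = 1_000_000;
    int memoryPerContainerMb = 3 * 1024;               // 3G per container, in MB

    // Aggregated pending memory across apps, kept in an int32 field:
    int pendingMb = containers * memoryPerContainerMb; // 3,072,000,000 wraps to a negative value

    // Simplified version of the "skip queues with no pending resources" guard:
    if (pendingMb <= 0) {
      System.out.println("queue skipped, nothing allocated (pendingMb=" + pendingMb + ")");
    }

    long pendingMbLong = (long) containers * memoryPerContainerMb;
    System.out.println("actual demand with int64: " + pendingMbLong + " MB");
  }
}
{code}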



[jira] [Commented] (YARN-4844) Upgrade fields of o.a.h.y.api.records.Resource from int32 to int64

2016-04-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252976#comment-15252976
 ] 

Wangda Tan commented on YARN-4844:
--

Discussed this issue with [~vinodkv], [~jianhe], [~hitesh].

The good news is that Google protobuf has backward/forward compatibility for 
all of the int fields, see 
https://developers.google.com/protocol-buffers/docs/proto#updating:
bq. int32, uint32, int64, uint64, and bool are all compatible – this means you 
can change a field from one of these types to another without breaking 
forwards- or backwards-compatibility. If a number is parsed from the wire which 
doesn't fit in the corresponding type, you will get the same effect as if you 
had cast the number to that type in C++ (e.g. if a 64-bit number is read as an 
int32, it will be truncated to 32 bits).
So we have no problem changing ResourceProto from int32 to int64.

In addition to the .proto change, the following changes are required for the 
API record Resource (a sketch follows after this list):
- Update {{set_(int ...)}} to {{set_(long ...)}}; there is no compatibility 
issue for setters.
- Add {{getMemoryLong}} and {{getVirtualCoresLong}} methods.

We also need to update the Metrics objects related to Resources, such as 
QueueMetrics, etc. AFAIK, there's no compatibility issue there.

The last part is the scheduler and test fixes.

Attached ver.1 patch for review.
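
A minimal sketch of the accessor shape described above (a hand-written stand-in for o.a.h.y.api.records.Resource, not the attached patch; the proto field names in the comment and the @Private placement, which follows the proposal elsewhere in this thread, are illustrative):

{code:java}
// Simplified stand-in for o.a.h.y.api.records.Resource. The underlying proto
// fields (illustratively, "memory" and "virtual_cores" in ResourceProto)
// change from int32 to int64, which is wire-compatible per the quote above.
public abstract class ResourceSketch {

  // Existing int accessors are kept so current callers still compile; they can
  // only report values that fit in 32 bits.
  public abstract int getMemory();
  public abstract int getVirtualCores();

  // New accessors backed by the int64 fields (proposed to carry an @Private
  // annotation so they can still be changed later).
  public abstract long getMemoryLong();
  public abstract long getVirtualCoresLong();

  // Setters widened from int to long; existing call sites passing an int keep
  // compiling because of Java's implicit widening, hence no setter
  // compatibility issue.
  public abstract void setMemory(long memory);
  public abstract void setVirtualCores(long vCores);
}
{code}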
