[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184146#comment-14184146
 ] 

Wangda Tan commented on YARN-2495:
--

{code}
the more labels there are, the heavier each heartbeat message becomes, and the 
more nodes in the cluster, the more network IO
comparing the previous and current label state for all nodes is done in the RM, 
whereas in the earlier method each NM took care of itself.
A lot of read locks need to be held in NodesLabelsManager
{code}
Makes sense, let's do it that way.

bq. Also, as we either accept all labels or reject all labels, can we have a flag 
indicating whether the RM accepted the labels or not, and modify the response 
proto when the NodeLabelsManager interface changes?
That also makes sense to me. One corner case: if an NM asks to remove all 
labels (by passing an empty list), how would the RM populate the reject list? So 
I agree to use a flag that says whether the last sync of node labels succeeded.

bq. RegisterNodeManagerRequestProto and NodeHeartbeatRequestProto are modified 
in this file ... the diff shows one line after the modification, which makes it 
look like I have modified RegisterNodeManagerResponseProto
Yes, you're correct, I misread the patch.

bq. So do we need to address this scenario by adding some boolean flag for 
DeCentralizedConfigEnable in RegisterNodeManagerRequestProto?
I think we shouldn't. In such a case, we should let the RM follow what is 
configured on the RM side. Basically, it's a configuration error that should be 
avoided.
So if RM=centralized, NM=decentralized, just print an error and reject such 
labels. 
If RM=decentralized, NM=centralized, the client side will receive an error 
message. 

{code}
createNodeStatusUpdater(context, dispatcher, 
nodeHealthChecker,nodeLabelsProviderService);
is it ok ?
{code}
That's fine; basically, I think we should avoid modifying it everywhere.

Thanks,
Wangda

> Allow admin specify labels in each NM (Distributed configuration)
> -
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml or using script 
> suggested by [~aw])
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-25 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184140#comment-14184140
 ] 

Naganarasimha G R commented on YARN-2495:
-

{quote}
3) NodeManager:
createNodeStatusUpdater : I suggest to create a overload method without the 
nodeLabelsProviderService to avoid lots of changes in test/mock classes.
{quote}
NodeManager calls the protected createNodeStatusUpdater method during its 
initservice method. So even if we add an overloaded method, initservice will 
still call the method that takes nodeLabelsProviderService, unless we add 
code like 
{quote} 
if (null == nodeLabelsProviderService) {
  createNodeStatusUpdater(context, dispatcher, nodeHealthChecker);
} else {
  createNodeStatusUpdater(context, dispatcher, nodeHealthChecker,
      nodeLabelsProviderService);
}
{quote} 
is it ok ?
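
The overload question can be sketched in plain Java. Everything below is a 
hypothetical stand-in (the class names only echo the ones in this thread, and 
this is not the actual NodeManager code); it assumes the three-argument method 
simply delegates to the four-argument one with a null provider.

```java
// Hypothetical stand-ins for the real NodeManager collaborators.
class Context {}
class Dispatcher {}
class NodeHealthChecker {}
class NodeLabelsProviderService {}

class NodeStatusUpdater {
  // null means "no distributed node labels".
  final NodeLabelsProviderService labelsProvider;

  NodeStatusUpdater(NodeLabelsProviderService labelsProvider) {
    this.labelsProvider = labelsProvider;
  }
}

class NodeManagerSketch {
  // Existing signature: callers that know nothing about labels keep using it.
  protected NodeStatusUpdater createNodeStatusUpdater(Context context,
      Dispatcher dispatcher, NodeHealthChecker healthChecker) {
    return createNodeStatusUpdater(context, dispatcher, healthChecker, null);
  }

  // New overload: the only place that knows about the labels provider.
  protected NodeStatusUpdater createNodeStatusUpdater(Context context,
      Dispatcher dispatcher, NodeHealthChecker healthChecker,
      NodeLabelsProviderService labelsProvider) {
    return new NodeStatusUpdater(labelsProvider);
  }

  // Mirrors the null check proposed in the comment above.
  NodeStatusUpdater init(NodeLabelsProviderService labelsProvider) {
    if (labelsProvider == null) {
      return createNodeStatusUpdater(new Context(), new Dispatcher(),
          new NodeHealthChecker());
    }
    return createNodeStatusUpdater(new Context(), new Dispatcher(),
        new NodeHealthChecker(), labelsProvider);
  }
}
```

Note that a test which overrides only the three-argument method is bypassed 
whenever a provider is present, which is the trade-off being asked about.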



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-25 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184133#comment-14184133
 ] 

Naganarasimha G R commented on YARN-2495:
-

{quote}
2.NodeHeartbeatRequestPBImpl:
2.3. Everytime set the up-to-date labels to NodeHeartbeatRequest when 
do heartbeat.
#1 and #2 will all need add more fields in NodeHeartbeatRequest. I suggest to 
do in #3, it's more simple and we can improve it in further JIRA. 
{quote}

I missed discussing this point in particular. If we take the 3rd approach, 
then on every heartbeat the RM side needs to do one of the following: 
* Invoke getNodeLabelManager().replaceLabelsOnNode(), which will 
validate the labels and then replace them even when the labels have not 
changed.
* Have the resource tracker service compare the labels from the heartbeat 
with those in the NodeLabels manager and, only if they differ, invoke 
getNodeLabelManager().replaceLabelsOnNode(). 

Both of these approaches are costlier, for the following reasons:
* The more labels there are, the heavier each heartbeat message becomes, 
and the more nodes in the cluster, the more network IO.
* Comparing the previous and current label state for all nodes is 
done in the RM, whereas in the earlier method each NM took care of itself.
* A lot of read locks need to be held in NodesLabelsManager. 

So based on these points I would prefer option 1, as it is a minimal change 
and we have anyway already modified the request and response (the proposed 
reject labels list).

{quote}
 5. NodeStatusUpdaterImpl: 
* In RM ResourceTracker, if exception raise when replace labels on node, put 
the new labels to reject node labels to response.
* In NM NodeStatusUpdater, if reject node labels is not null, LOG.error 
rejected node labels, and print diagnostic message.
{quote}
You meant "reject node labels is not empty", right? Based on comment 2, null 
cannot be identified for repeated fields.
Also, as we either accept all labels or reject all labels, can we have a flag 
indicating whether the RM accepted the labels or not, and modify the response 
proto when the NodeLabelsManager interface changes?

bq. 6) yarn_server_common_service_protos.proto
RegisterNodeManagerRequestProto and NodeHeartbeatRequestProto are modified in 
this file ... the diff shows one line after the modification, which makes it 
look like I have modified RegisterNodeManagerResponseProto.

bq. 9 It no need to send shutdown message when any of the labels not accepted 
by RMNodeLabelsManager. Just add them to a reject node labels list, and add 
diagnostic message should be enough.
OK, I will take care of this. Earlier I tried to send a shutdown message 
because I wanted to avoid the scenario where the RM is configured for 
CentralNodeLabel and the NM for distributed.
But as per your earlier comment, ??"PB cannot tell difference between null and 
empty for repeated fields."??, the RM cannot detect this case from the labels 
alone. 
So do we need to address this scenario by adding some boolean flag for 
DeCentralizedConfigEnable in RegisterNodeManagerRequestProto?





[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-24 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183135#comment-14183135
 ] 

Wangda Tan commented on YARN-2495:
--

Hi Naga,
Thanks for working on this patch. 
Comments, round #1:

1) YarnConfiguration:
I think we should add DEFAULT_DECENTRALIZED_NODELABEL_CONFIGURATION_ENABLED = 
false to avoid hardcoding "false" in implementations.
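
A minimal sketch of that pattern, using java.util.Properties as a stand-in for 
the Hadoop configuration object. The class YarnConfigurationSketch and its 
helper methods are hypothetical; only the two constant names and the property 
key are taken from this thread.

```java
import java.util.Properties;

// Hypothetical stand-in for YarnConfiguration.
class YarnConfigurationSketch {
  static final String ENABLE_DECENTRALIZED_NODELABEL_CONFIGURATION =
      "yarn.node-labels.decentralized-configuration.enabled";
  static final boolean DEFAULT_DECENTRALIZED_NODELABEL_CONFIGURATION_ENABLED =
      false;

  // Minimal analogue of conf.getBoolean(name, defaultValue).
  static boolean getBoolean(Properties conf, String name, boolean defaultValue) {
    String value = conf.getProperty(name);
    return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
  }

  // Call sites reference the named default instead of a literal "false".
  static boolean isDecentralizedEnabled(Properties conf) {
    return getBoolean(conf, ENABLE_DECENTRALIZED_NODELABEL_CONFIGURATION,
        DEFAULT_DECENTRALIZED_NODELABEL_CONFIGURATION_ENABLED);
  }
}
```

With this shape the default lives in one place, so the NM and RM call sites 
cannot drift apart.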

2) NodeHeartbeatRequestPBImpl:
I just found that the current PB cannot tell the difference between null and 
empty for repeated fields. In your implementation, an empty set will always be 
returned whether the field was never set or was set to an empty set.
So the convention we defined, null for "not changed" and empty for "no label", 
no longer holds.
What we can do is one of:
# Add a new field in NodeHeartbeatRequest, like "boolean nodeLabelUpdated".
# Use the add/removeLabelsOnNodes API provided by RMNodeLabelsManager, 
and pass only the changed labels every time.
# Set the up-to-date labels on NodeHeartbeatRequest on every heartbeat.

#1 and #2 both need more fields in NodeHeartbeatRequest. I suggest doing #3; 
it's simpler and we can improve it in a further JIRA.
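
To illustrate the null-versus-empty problem and option #1, here is a minimal 
sketch in plain Java. HeartbeatSketch and its methods are hypothetical; it is 
not the real NodeHeartbeatRequestPBImpl and only mimics the behaviour that a 
repeated field always reads back as a (possibly empty) collection.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a heartbeat request whose label field behaves like a
// protobuf repeated field: the getter never returns null.
class HeartbeatSketch {
  private final Set<String> nodeLabels = new HashSet<String>();
  private boolean nodeLabelUpdated = false; // the extra field from option #1

  // Called only when the NM wants to report a (possibly empty) new label set.
  void setNodeLabels(Set<String> labels) {
    nodeLabels.clear();
    nodeLabels.addAll(labels);
    nodeLabelUpdated = true;
  }

  // "Never set" and "set to empty" are indistinguishable through this getter.
  Set<String> getNodeLabels() {
    return Collections.unmodifiableSet(nodeLabels);
  }

  // The flag is what tells the two cases apart.
  boolean isNodeLabelUpdated() {
    return nodeLabelUpdated;
  }
}
```

Option #3 avoids the flag altogether by always sending the full label set, at 
the cost of a larger heartbeat.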

3) NodeManager:
{code}
+if (conf.getBoolean(
+YarnConfiguration.ENABLE_DECENTRALIZED_NODELABEL_CONFIGURATION, 
false)) {
{code}
Instead of hardcoding "false" here, we should use 
"DEFAULT_DECENTRALIZED_NODELABEL_CONFIGURATION_ENABLED".

bq. +  addService((Service) provider);
Why do this type conversion? I think we don't need it.

bq. createNodeStatusUpdater
I suggest creating an overloaded method without the nodeLabelsProviderService 
to avoid lots of changes in test/mock classes.

4) NodeLabelsProviderService:
It should extend AbstractService; there are some default implementations in 
AbstractService, so we don't need to implement all of them.

5) NodeStatusUpdaterImpl:
{{isDecentralizedNodeLabelsConf}} may not be needed here: if the 
nodeLablesProvider passed in is null, that means 
{{isDecentralizedNodeLabelsConf}} is false. 

{code}
+nodeLabelsForHeartBeat = null;
+if (isDecentralizedNodeLabelsConf) {
...
{code}
According to my comment 2), I suggest keeping it simple: if the provider is not 
null, set NodeHeartbeatRequest.nodeLabels to the labels obtained from the 
provider.

{code}
+if (nodeLabelsForHeartBeat != null
+&& response.getDiagnosticsMessage() != null) {
+  LOG.info("Node Labels were rejected from RM "
+  + response.getDiagnosticsMessage());
+}
{code}
We cannot assume that a non-null diagnosticsMessage means the node labels were 
rejected. I suggest adding a rejected-node-labels field to RegisterNMResponse 
and NodeHeartbeatResponse. The existing behavior in RMNodeLabelsManager is that 
if any of the labels is not valid, all labels are rejected. What you should do 
is:
# In RM ResourceTracker, if an exception is raised when replacing labels on a 
node, put the new labels into the reject node labels of the response.
# In NM NodeStatusUpdater, if reject node labels is not null, LOG.error the 
rejected node labels and print the diagnostic message.
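
The two steps can be sketched as follows. All class and method names here are 
hypothetical stand-ins rather than the real ResourceTrackerService or 
NodeStatusUpdaterImpl, and the sketch assumes an empty rejected set (rather 
than null) means "nothing rejected", since a repeated field cannot carry null.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical response carrying the labels the RM refused to apply.
class HeartbeatResponseSketch {
  final Set<String> rejectedNodeLabels;
  final String diagnosticsMessage;

  HeartbeatResponseSketch(Set<String> rejected, String diagnostics) {
    this.rejectedNodeLabels = rejected;
    this.diagnosticsMessage = diagnostics;
  }
}

class ResourceTrackerSketch {
  private final Set<String> clusterValidLabels;

  ResourceTrackerSketch(Set<String> clusterValidLabels) {
    this.clusterValidLabels = clusterValidLabels;
  }

  // Stand-in for replacing labels on a node: all-or-nothing validation.
  private void replaceLabelsOnNode(Set<String> labels) {
    if (!clusterValidLabels.containsAll(labels)) {
      throw new IllegalArgumentException("unknown label in " + labels);
    }
  }

  // RM side: on failure, echo the whole new label set back as rejected.
  HeartbeatResponseSketch handleHeartbeat(Set<String> newLabels) {
    try {
      replaceLabelsOnNode(newLabels);
      return new HeartbeatResponseSketch(Collections.<String>emptySet(), null);
    } catch (IllegalArgumentException e) {
      return new HeartbeatResponseSketch(new HashSet<String>(newLabels),
          e.getMessage());
    }
  }
}

class NodeStatusUpdaterSketch {
  // NM side: returns the error line it would log, or null if nothing was
  // rejected.
  static String checkResponse(HeartbeatResponseSketch response) {
    if (response.rejectedNodeLabels.isEmpty()) {
      return null;
    }
    return "Node labels " + String.join(",", response.rejectedNodeLabels)
        + " were rejected by RM: " + response.diagnosticsMessage;
  }
}
```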

As suggested in 3), create an overloaded constructor to avoid lots of changes 
in tests.

6) yarn_server_common_service_protos.proto
I think you mistakenly added nodeLabels to {{RegisterNodeManagerResponseProto}}; 
shouldn't it be in {{RegisterNodeManagerRequestProto}}? :)

7) ConfigurationNodeLabelsProvider:
{code}
+String[] nodeLabelsFromScript =
+
StringUtils.getStrings(conf.get(YarnConfiguration.NM_NODE_LABELS_PREFIX, ""));
{code}
# nodeLabelsFromScript -> nodeLabelsFromConfiguration
# YarnConfiguration.NM_NODE_LABELS_PREFIX -> add an option like 
YarnConfiguration.NM_NODE_LABELS_FROM_CONFIG (NM_NODE_LABELS_PREFIX + 
"from-config") or some name you prefer -- At least it shouldn't be a prefix.

8) TestEventFlow:
Doesn't just passing null for nodeLabelsProvider work?

9) ResourceTrackerService:
{code}
+isDecentralizedNodeLabelsConf = conf.getBoolean(
+YarnConfiguration.ENABLE_DECENTRALIZED_NODELABEL_CONFIGURATION, false);
{code}
Avoid hardcoding the config default here, as suggested above.

There is no need to send a shutdown message when any of the labels is not 
accepted by RMNodeLabelsManager. Just adding them to a reject node labels list 
and adding a diagnostic message should be enough.

{code}
++ ", assigned nodeId " + nodeId + ", node labels { "
++ nodeLabels.toString()+" } ";
{code}
You should use StringUtils.join when you want to convert a set of labels to a 
String; the format of set.toString() is not well defined.
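
As an illustration of joining the labels explicitly, the sketch below uses the 
JDK's String.join as a stand-in for Hadoop's StringUtils.join so that it stays 
self-contained. LabelFormatSketch is a hypothetical helper, and sorting through 
a TreeSet is an added assumption to keep log output stable.

```java
import java.util.Set;
import java.util.TreeSet;

class LabelFormatSketch {
  // Joins labels with commas; the TreeSet copy gives a sorted, stable order.
  static String describe(String nodeId, Set<String> nodeLabels) {
    return "assigned nodeId " + nodeId + ", node labels { "
        + String.join(",", new TreeSet<String>(nodeLabels)) + " } ";
  }
}
```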

More comments will be added once you have addressed the above comments and 
added tests for them.

Thanks,
Wangda



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182473#comment-14182473
 ] 

Wangda Tan commented on YARN-2495:
--

bq. But currently i have NodeLabelProvider as service (as per your earlier 
comments ) and is part of NodeStatus updater, so planning to have some dummy 
Service which doesn't give any node labels and is static
I'm not sure what the dummy service is for. Is it for test purposes? I think 
if we don't enable decentralized node label configuration, we don't need to 
initialize such a service at all; NodeStatusUpdater will only get labels from 
it if decentralized configuration is enabled. If it is for test purposes, that 
should be fine :)

Thanks,



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182306#comment-14182306
 ] 

Naganarasimha G R commented on YARN-2495:
-

Hi [~wangda] 
"ENABLE_DECENTRALIZED_NODELABEL_CONFIGURATION" is fine and I will work on that. 
I will first focus on the NM side and the ResourceTracker changes in the RM. 
But currently I have NodeLabelProvider as a service (as per your earlier 
comments) and it is part of NodeStatusUpdater, so I am planning to have some 
dummy service which doesn't provide any node labels and is static.



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182257#comment-14182257
 ] 

Wangda Tan commented on YARN-2495:
--

[~Naganarasimha],
One comment before you upload the patch:
I suggest having an option to indicate whether decentralized node label 
configuration is in use. If it is true, the NM will perform steps such as 
creating the NodeLabelProvider, setting up labels in NodeHeartbeatRequest, etc.  
If that makes sense to you, I suggest we call it 
"ENABLE_DECENTRALIZED_NODELABEL_CONFIGURATION" -> 
(yarn.node-labels.decentralized-configuration.enabled), or do you have other 
suggestions?
Also, that value will be used by the RM; the RM needs to do similar things, 
such as disabling admin changes to labels on nodes via the RM admin CLI, etc. I 
think you can first focus on the NM side and the ResourceTracker changes in the 
RM. AdminService-related changes can be split into another JIRA.

Thanks,
Wangda



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181859#comment-14181859
 ] 

Wangda Tan commented on YARN-2495:
--

Hi Naga,
bq. you meant NodeHeartBeatResponse right ?
Yes

Looking forward to your patch.

Wangda



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181771#comment-14181771
 ] 

Naganarasimha G R commented on YARN-2495:
-

Hi [~wangda]
Actually what I meant was to update the HeartBeatResponse about the labels' 
acceptance by the RM, and once NodeStatusUpdater gets the response (+ve or -ve) 
from the RM it can set the LabelsProvider with the appropriate flag. But your 
logic seems to be much better, because I was handling thread sync unnecessarily 
in ConfNodeLabelsProvider. Having this logic in NodeStatusUpdater removes the 
burden on each type of NodeLabelsProvider to have this sync logic, and the 
interface of NodeLabelsProvider will be simple (earlier my thinking was that 
labels should not be handled by NodeStatusUpdater, hence I kept it in 
NodeLabelsProvider).
I was about to upload the patch with my logic; as it's not as per your latest 
comments, I will upload one more by tomorrow afternoon (IST) after correcting 
it as per your comments. 

bq. Add a reject node labels list in NodeHeartbeatRequest – we may not have to 
handle this list for now. But we can keep it on the interface
you meant NodeHeartBeatResponse right ?



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181646#comment-14181646
 ] 

Wangda Tan commented on YARN-2495:
--

1) 
bq. But was thinking about one scenario: labels got changed and on call to 
NodeLabelsProvider.getLabels() it returns the new labels but the heartbeat 
failed due to some reason. 
If the heartbeat fails, the resource tracker on the NM side cannot get a 
NodeHeartbeatResponse. But I'm thinking of another case: labels reported by NMs 
can be invalid and rejected by the RM. The NM should be notified about such 
cases.

So I would suggest doing it this way:
- Keep getNodeLabels in NodeHeartbeatRequest and RegisterNodeManagerRequest.
- Add a reject node labels list in NodeHeartbeatRequest -- we may not have to 
handle this list for now. But we can keep it on the interface
- Add a "lastNodeLabels" field in NodeStatusUpdater; it will save the last node 
labels list obtained from NodeLabelFetcher. In the while loop of 
{{startStatusUpdater}}, we will check whether the new list fetched from 
NodeLabelFetcher is different from our last node labels list. If it is 
different, we will set it; if it is the same, we will skip it and set the 
labels to null in the next heartbeat.

The interface of NodeLabelsProvider should be simple, just a 
getNodeLabels(); NodeStatusUpdater will take care of the rest.
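
The "lastNodeLabels" idea can be sketched like this. NodeLabelsProviderSketch 
and LabelChangeTracker are hypothetical names, and the single method stands in 
for one iteration of the status updater loop; this is an illustration under 
those assumptions, not the actual NodeStatusUpdater code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical provider interface: a single getter, as suggested above.
interface NodeLabelsProviderSketch {
  Set<String> getNodeLabels();
}

class LabelChangeTracker {
  private final NodeLabelsProviderSketch provider;
  private Set<String> lastNodeLabels; // null until the first heartbeat

  LabelChangeTracker(NodeLabelsProviderSketch provider) {
    this.provider = provider;
  }

  // Returns the labels to put in the next heartbeat, or null if unchanged.
  Set<String> labelsForNextHeartbeat() {
    Set<String> current = new HashSet<String>(provider.getNodeLabels());
    if (current.equals(lastNodeLabels)) {
      return null;
    }
    lastNodeLabels = current;
    return current;
  }
}
```

Keeping the comparison here means each provider implementation only has to 
answer "what are the labels right now".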

2)
bq. and for If it's distributed, AdminService should disable admin change 
labels on nodes via RM admin CLI will add a jira, but was wondering how to do 
this ? by configuration with new parameter?
Yes, we should add a new parameter for it. We may not need to have this 
immediately, but we should have one in the future. 

bq. I was earlier under the impression as MemoryRMNodeLabelsManager => is for 
distributed Configuration and RMNodeLabelsManager is for Centralized 
configuration. and some factory will take care of this
Not really; the difference between them is that one persists labels to the 
filesystem and the other does not. We still have to do something for the 
distributed configuration.

Any thoughts? [~vinodkv]

Thanks,
Wangda



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-23 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181304#comment-14181304
 ] 

Naganarasimha G R commented on YARN-2495:
-

Thanks for reviewing, Wangda:
bq. 2) It seems NM_LABELS_FETCH_INTERVAL_MS not been used in the patch, did you 
forget to do that?
-- Earlier I was planning to make only the node labels script dynamic and keep 
the configuration-based one static. Now, based on your comment 4, I will make 
it dynamic and change the configuration name too.

bq. 3) Regarding ResourceTrackerProtocol, I think NodeHeartbeatRequest should 
only report labels when labels changed. So there're 3 possible values of node 
labels in NodeHeartbeatRequest ... And RegisterNodeManagerRequest should report 
label every time registering.
-- Yes, this was my plan and I will do it the same way. 
But I was thinking about one scenario: the labels got changed, and on a call to 
NodeLabelsProvider.getLabels() it returns the new labels, but the heartbeat 
failed for some reason. In that case NodeLabelsProvider will not be able to 
detect this, and on the next request to getLabels() it will return null. So we 
should have some mechanism by which NodeLabelsProvider is informed whether the 
RM accepted the change in labels, so that the appropriate set of labels is 
provided on a call to getLabels (also, if needed, we can have the RM's rejected 
labels too, for logging purposes).
I am planning to have 3 methods in the NodeLabelsProvider interface:
* getNodeLabels() : to get the labels which can be used for 
registration 
* getNodeLabelsOnModify() :  to get the labels on modification, which 
can be used for heartbeat
* rmUpdateNodeLabelsStatus(boolean success) : to indicate that the next 
call to getNodeLabelsOnModify can be reset to null
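
A rough sketch of what such a three-method provider could look like. The 
interface name ModifyAwareLabelsProvider, the in-memory implementation and its 
setLabels method are all hypothetical and exist only to make the proposal 
concrete.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical shape of the three-method provider proposed above.
interface ModifyAwareLabelsProvider {
  Set<String> getNodeLabels();          // full set, used at registration
  Set<String> getNodeLabelsOnModify();  // pending change for heartbeat, or null
  void rmUpdateNodeLabelsStatus(boolean success); // RM accepted or not
}

class InMemoryLabelsProvider implements ModifyAwareLabelsProvider {
  private Set<String> labels = new HashSet<String>();
  private boolean pending = false; // true until the RM acknowledges the change

  synchronized void setLabels(Set<String> newLabels) {
    labels = new HashSet<String>(newLabels);
    pending = true;
  }

  public synchronized Set<String> getNodeLabels() {
    return new HashSet<String>(labels);
  }

  public synchronized Set<String> getNodeLabelsOnModify() {
    return pending ? new HashSet<String>(labels) : null;
  }

  public synchronized void rmUpdateNodeLabelsStatus(boolean success) {
    if (success) {
      pending = false; // next getNodeLabelsOnModify() returns null again
    }
  }
}
```

The sketch deliberately ignores a label change that arrives between a 
heartbeat and its acknowledgement; handling that correctly is the kind of 
synchronization each provider would have to carry with this design.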

bq. 4.1 Why this class extends from CompositeService? Did you want to add more 
component to it? If not, AbstractService should be enough. If the purpose of 
the NodeLabelsFetcherService is only create a NodeLabelsProvider, and the 
NodeLabelsProvider will take care of periodically read configuration from 
yarn-site.xml.I suggest to rename NodeLabelsFetcherService to 
NodeLabelsProviderFactory, and not extends from any Service, because the 
NodeLabelsProvider should be a Service. Rename NodeLabelsProvider to 
NodeLabelsProviderService if your purpose is as what I mentioned.
-- Your idea seems better. I will try to do it the way you have specified, and 
hence NodeLabelsFetcherService will become a factory or I will make it 
obsolete.
ConfigurationNodeLabelsProvider : I will make it dynamic, i.e. it will 
periodically read yarn-site and get the labels.
{quote}
 6) More implementation suggestions:
Since we need central node labels configuration, I suggest to leverage what we 
already have in RM admin CLI directly – user can use RM admin CLI add/remove 
node labels. We can disable this when we're ready to do non-central node label 
configuration.And there should be an option to tell if distributed node label 
configuration is used. If it's distributed, AdminService should disable admin 
change labels on nodes via RM admin CLI. I suggest to do this in a separated 
JIRA.
{quote}
-- I presume "central node labels configuration" means the "Cluster Valid Node 
Labels" stored on the RM side for validation of labels. If so, OK, I will do it 
in the same way as the RM admin CLI.
And for ??If it's distributed, AdminService should disable admin change labels 
on nodes via RM admin CLI?? I will add a JIRA, 
but I was wondering how to do this: by configuration with a new parameter? I 
was earlier under the impression that MemoryRMNodeLabelsManager is for 
distributed configuration and RMNodeLabelsManager is for centralized 
configuration, and some factory will take care of this.

I will handle the other comments.



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-22 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181031#comment-14181031
 ] 

Wangda Tan commented on YARN-2495:
--

1) Please remove all script-related configurations in YarnConfiguration.

2) It seems NM_LABELS_FETCH_INTERVAL_MS is not used in the patch; did you 
forget to do that?

3) Regarding ResourceTrackerProtocol, I think NodeHeartbeatRequest should only 
report labels when the labels have changed. So there are 3 possible values of 
node labels in NodeHeartbeatRequest:
   1. null: labels not changed
   2. empty array: no label on this node
   3. non-empty array: labels on this node
   And RegisterNodeManagerRequest should report labels every time the NM 
   registers.
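
The three cases can be written down as a small classifier. This is only an 
illustration of the intended convention with hypothetical names 
(HeartbeatLabelsSketch, LabelUpdate); it models the field as a nullable Java 
Set rather than an actual protobuf message.

```java
import java.util.Set;

class HeartbeatLabelsSketch {
  enum LabelUpdate { UNCHANGED, CLEAR_ALL, REPLACE }

  // Maps the three possible values of the label field to an action.
  static LabelUpdate classify(Set<String> labelsInHeartbeat) {
    if (labelsInHeartbeat == null) {
      return LabelUpdate.UNCHANGED;   // 1. null: labels not changed
    }
    if (labelsInHeartbeat.isEmpty()) {
      return LabelUpdate.CLEAR_ALL;   // 2. empty: no label on this node
    }
    return LabelUpdate.REPLACE;       // 3. non-empty: labels on this node
  }
}
```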

4) NodeLabelsFetcherService:
4.1 Why does this class extend CompositeService? Did you want to add more 
components to it? If not, AbstractService should be enough. If the purpose of 
NodeLabelsFetcherService is only to create a NodeLabelsProvider, and the 
NodeLabelsProvider will take care of periodically reading the configuration 
from yarn-site.xml, I suggest renaming NodeLabelsFetcherService to 
NodeLabelsProviderFactory and not extending any Service, because the 
NodeLabelsProvider should be the Service. Rename NodeLabelsProvider to 
NodeLabelsProviderService if your purpose is what I described.

4.2 This should be located in 
org.apache.hadoop.yarn.server.nodemanager.nodelabels; also rename 
"org.apache.hadoop.yarn.server.nodemanager.nodelabel" to 
"org.apache.hadoop.yarn.server.nodemanager.nodelabels" in the other class 
definitions.

5) ConfigurationNodeLabelsProvider
This should be an independent class located in its own file.

6) More implementation suggestions:
Since we need central node labels configuration, I suggest leveraging what we 
already have in the RM admin CLI directly -- users can use the RM admin CLI to 
add/remove node labels. We can disable this when we're ready to do non-central 
node label configuration.
There should also be an option to tell whether distributed node label 
configuration is used. If it's distributed, AdminService should disable admin 
changes to labels on nodes via the RM admin CLI. I suggest doing this in a 
separate JIRA.



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-22 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179673#comment-14179673
 ] 

Wangda Tan commented on YARN-2495:
--

[~Naganarasimha], [~aw], let me first give you an overview of what we need 
to do to support labels in the capacity scheduler; that will help you better 
understand why we need central node label validation now. 
In the existing capacity scheduler (patch of YARN-2496), we can specify 
which labels each queue can access (to make sure important resources can only 
be used by privileged users), and the proportion of resource per label (e.g. 
the "marketing" queue can access 80% of GPU resource). Now if a user wants to 
leverage this change in the capacity scheduler, the user *MUST* specify 1) the 
labels that can be accessed by the queue and 2) the proportion of each label's 
resource that the queue can access.
Back to the central node label validation discussion: without it, we cannot 
get the capacity scheduler to work for now (a user cannot specify capacity for 
an unknown node label for a queue, etc.). So I still insist on having central 
node label validation for both centralized and distributed node label 
configuration, at least for the 2.6 release. This might change in the future; 
I suggest moving the disabling of central node label configuration to a 
separate task for further discussion.

I've also taken a quick look at the WIP patch uploaded by [~Naganarasimha]; 
thanks for it. Several suggestions on the patch:
- Per the comments above, do not change {{CommonNodeLabelsManager}}; move 
the changes that disable central node label validation to a separate patch for 
further discussion. 
- Make this patch contain a {{NodeLabelProvider}} only, and create separate 
JIRAs for {{ScriptNodeLabelProvider}} and for an implementation that reads node 
labels from yarn-site.xml, for easier review.
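
To make the dependency on a central label set concrete, here is a minimal sketch of the kind of check the scheduler configuration needs: a queue's per-label capacity can only be validated if the label is known centrally. `QueueLabelCapacityCheck` and its `validate` method are hypothetical names used for illustration; this is not the YARN-2496 code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper; not part of the capacity scheduler. */
final class QueueLabelCapacityCheck {
    private QueueLabelCapacityCheck() {}

    /**
     * Returns the problems found in one queue's label -> capacity (percent) settings.
     * An empty list means the settings are acceptable.
     */
    static List<String> validate(String queue, Map<String, Integer> capacityByLabel,
                                 Set<String> clusterLabels) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a stable order for the reported problems.
        for (Map.Entry<String, Integer> e : new TreeMap<>(capacityByLabel).entrySet()) {
            String label = e.getKey();
            int percent = e.getValue();
            if (!clusterLabels.contains(label)) {
                // Without a central label set, this case could not be detected at all.
                problems.add(queue + ": unknown node label '" + label + "'");
            } else if (percent < 0 || percent > 100) {
                problems.add(queue + ": capacity for '" + label
                    + "' must be 0-100, got " + percent);
            }
        }
        return problems;
    }
}
```

For example, a "marketing" queue asking for 80% of a known "GPU" label passes, while the same queue asking for capacity on a label the RM has never heard of is reported as a problem.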



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-21 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179507#comment-14179507
 ] 

Vinod Kumar Vavilapalli commented on YARN-2495:
---

This is very useful to get in for 2.6, [~leftnoteasy]/[~Naganarasimha] how 
feasible is it?



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-21 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178762#comment-14178762
 ] 

Allen Wittenauer commented on YARN-2495:


bq. while modifying the script he will be able to configure the valid labels 
too.

The script can be updated *independently* of changing the running configuration 
files.  Changing the xml configs will also require a *coordinated* reconfigure 
of the RM.  That isn't realistic, especially for things such as rolling 
upgrades. HARM, of course, makes the situation even worse. Additionally, I'm 
sure the label validation code will spam the RM logs every time it gets an 
invalid label, which is pretty much a "please fill the log directory" action.

The *only* scenario I can think of where label validation has a practical use 
is if AMs and/or containers are allowed to inject labels.  But that should be a 
different control structure altogether and have zero impact on administrator 
controlled labels.

bq. Seems like maintenance wise it might become difficult for example,

Label validation actually makes your example worse because now the labels 
disappear completely.  Is it a problem with the script or is it a problem with 
the label definition?

bq.  i feel centralized Label validation can be made configurable. Please 
provide opinion on this.

Just disable it completely.  I'm still waiting to hear what practical 
application this bug would have.



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178608#comment-14178608
 ] 

Naganarasimha G R commented on YARN-2495:
-

Hi [~aw] ,
bq. I don't think you understand the use case at all. In fact, it's clear you 
need to re-read the sample script. It does not get updated with every new JDK. 
It's smart enough to update the label regardless of the JDK that is installed...
I meant that the script needs to be modified for a new label set. For example, 
the admin has currently configured JDK labels and later wants to add a label 
related to some native lib version. *As the admin will know all the valid 
native lib versions in the system (or this list can be obtained automatically), 
while modifying the script he will be able to configure the valid labels too*.

bq. which means the only friction to operations is point is going to be 
updating this 'valid label list' on the RM.
Seems like maintenance wise it might become difficult. For example, once the 
valid JDK labels are loaded, admins may forget about this feature; later, based 
on some requirement, some other admin/person might update the JDK without being 
aware that a script exists which updates the labels based on the JDK or native 
lib version. So he might miss updating the valid labels, and that node might 
not be usable, or wrong labels will be tagged to it because the new labels were 
not added.

So I feel Allen's scenario needs to be addressed. As [~Wangda] suggested, I feel 
centralized label validation can be made configurable. Please provide your 
opinion on this.



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-16 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174584#comment-14174584
 ] 

Allen Wittenauer commented on YARN-2495:


bq. I understood the use case but what i did not understand is how would it 
restrict/deter a user, as he can do one more updation ; one more label to the 
central valid label list, like java version or jdk version etc. As anyway 
script will be written/updated to get specific set of labels so i feel in most 
cases admin can know what lables will be coming in the cluster. Any other use 
case where it will be difficult for admin to list the labels before hand ?

I don't think you understand the use case at all.  In fact, it's clear you need 
to re-read the sample script.  It does *not* get updated with every new JDK.  
It's smart enough to update the label regardless of the JDK that is 
installed... which means the *only* friction to operations is point is going to 
be updating this 'valid label list' on the RM.  



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-14 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171027#comment-14171027
 ] 

Naganarasimha G R commented on YARN-2495:
-

Thanks [~aw],[~wangda] & [~sunilg],

*For [~aw] comments:*
bq. I don't fully understand your question, but I'll admit I've been distracted 
with other JIRAs lately.
You got the first part of my question right, and thanks for detailing the 
scenario.

bq. If we are rolling out a new version of the JDK, I shouldn't have to tell 
the system that it's ok to broadcast that JDK version first.
I understood the use case, but what I did not understand is how it would 
restrict/deter a user, as he can do one more update: add one more label to the 
central valid label list, like the java version or jdk version etc. As the 
script will anyway be written/updated to produce a specific set of labels, I 
feel that in most cases the admin can know what labels will be coming into the 
cluster. Is there any other use case where it will be difficult for the admin 
to list the labels beforehand?

*For [~wangda] comments:*
bq. they will be reported to RM when NM registration. We may not need to 
persist any of them, but RM should know these labels existence to do scheduling.
Does the RM need to have the full list of valid labels even before all nodes 
have registered? How will it impact scheduling? How is it different from 
central configuration? In centralized config, the user needs to add the new 
labels and then send the node-to-labels mapping. Similarly, in distributed 
config we can first find the new labels and update the superset list of 
labels, and then update the label mapping for a node which wants to update or 
modify its labels.

bq. Another question is if we need check labels when they registering, I prefer 
to pre-set them because this affects scheduling behavior. For example, the 
maximum-resource and minimum-resource are setup in RM side, and RackResolver is 
also run in RM side
Maybe I did not get this correctly. Do you mean that when an NM registers for 
the first time after startup, you want it to have preset labels apart from what 
is read from the NM's yarn-site.xml/script? I did not get this clearly; please 
elaborate. 

bq. At least, the label checking should be kept configurable in distributed 
mode. – "just ignore all the labels for that node if invalid labels exists" 
might be a good way when it enabled.
In your earlier statement you said it affects scheduling; if so, how will 
keeping it configurable solve that? But what was clear was:
* Support to add and remove valid labels at the centralized level is required
* RM will do the label validation on NM registration & heartbeat
* If, while validating (during NM registration & heartbeat), one of the labels 
fails for a given node, then we will just ignore all the labels for that node.
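
The three points above can be sketched roughly as follows. This is only an illustration of the all-or-nothing rule, assuming the RM holds a set of valid cluster labels; `NodeLabelValidator` and `acceptOrIgnoreAll` are hypothetical names, not the NodeLabelsManager API.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical RM-side check applied on NM registration and heartbeat. */
final class NodeLabelValidator {
    private final Set<String> validClusterLabels;

    NodeLabelValidator(Set<String> validClusterLabels) {
        // Defensive copy so later changes by the caller do not affect validation.
        this.validClusterLabels = new HashSet<>(validClusterLabels);
    }

    /**
     * All-or-nothing: returns the reported labels if every one of them is valid,
     * otherwise an empty set (all labels for that node are ignored).
     */
    Set<String> acceptOrIgnoreAll(Set<String> reportedLabels) {
        if (validClusterLabels.containsAll(reportedLabels)) {
            return Collections.unmodifiableSet(new HashSet<>(reportedLabels));
        }
        return Collections.emptySet();
    }
}
```

Note that an empty result is ambiguous here: it can mean the NM reported no labels or that its labels were rejected, so a real response would presumably need a separate accepted/rejected indication.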

*For [~sunilg] comments:*
bq. If any such node label is invalid as per RM, then how this will be reported 
back to NM? Error Handling?
I too have the same doubt, and feel that usability will be reduced if the error 
is not propagated back to the NM, as the script is executed in one place and 
the validation happens in another. 

bq. But imagine a 1000 node cluster, and then with changing labels per 
heartbeat, will this be a bottleneck?
We will not be changing labels on every heartbeat. I will try to ensure that 
during heartbeat a node sends the updated label set only if its labels have 
changed from the previous set. But there will still be an issue: a lot of 
contention will occur if, say, some script is modified and all 2000 nodes want 
to update their labels.
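
A minimal sketch of the "send only when changed" idea, assuming the NM keeps the last label set it reported. `HeartbeatLabelTracker` and `labelsToSend` are hypothetical names and not part of the NodeStatusUpdater.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical NM-side helper that decides whether labels go into a heartbeat. */
final class HeartbeatLabelTracker {
    // Labels most recently handed out for sending; null until the first report.
    private Set<String> lastSent;

    /**
     * Returns the labels to include in the next heartbeat, or null when they are
     * unchanged since the last report and can be omitted.
     */
    Set<String> labelsToSend(Set<String> currentLabels) {
        if (lastSent != null && lastSent.equals(currentLabels)) {
            return null;
        }
        lastSent = new HashSet<>(currentLabels);
        return new HashSet<>(lastSent);
    }
}
```

This sketch records the labels as sent as soon as they are handed out; a real implementation would more likely update that state only after the RM acknowledges them. Returning null rather than an empty set keeps "nothing changed" distinct from "all labels removed".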



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170860#comment-14170860
 ] 

Sunil G commented on YARN-2495:
---

Hi All.

I have a doubt here.

1. In distributed configuration, each NM can specify labels in 
register/heartbeat (update). I am not sure whether the check for "Valid Label" 
should happen in the RM or the NM. As per the current design, it looks like all 
validity checks happen at the RM. 
If any such node label is invalid as per the RM, how will this be reported 
back to the NM? Error handling?

2. If it is possible to change labels at run time from the NM, I think the same 
existing interface (heartbeat) is used. Do you feel this check may happen more 
frequently in the RM than in a centralized configuration? In centralized 
config, the admin will fire some command to change labels; this may not be 
frequent. But imagine a 1000 node cluster, and then with changing labels per 
heartbeat, will this be a bottleneck?




[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169984#comment-14169984
 ] 

Wangda Tan commented on YARN-2495:
--

Hi [~Naganarasimha],
The first thing I think we need to keep in mind is that distributed 
configuration should be a way to input and store node label configuration. In 
centralized configuration, users can input labels and node-to-labels mappings 
via the REST API and RM admin CLI, and they will be persisted to some file 
system like HDFS/ZK.

For distributed configuration, node labels are input by NMs (either by setting 
yarn-site.xml or by some script), and they will be reported to the RM at NM 
registration. We may not need to persist any of them, but the RM should know 
these labels exist in order to do scheduling.

Another question is whether we need to check labels when NMs register. I prefer 
to pre-set them because this affects scheduling behavior. For example, the 
maximum-resource and minimum-resource are set up on the RM side, and 
RackResolver also runs on the RM side. At least, the label checking should be 
kept configurable in distributed mode -- "just ignore all the labels for that 
node if invalid labels exist" might be a good approach when it is enabled.

For implementation, I suggest you take a look at YARN-2494 and 
YARN-2496/YARN-2500; the distributed configuration should be an extension of 
NodeLabelsManager, and YARN-2496/YARN-2500 shouldn't need to be modified too 
much to support the distributed mode in scheduling.

Thanks,
Wangda



[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169528#comment-14169528
 ] 

Allen Wittenauer commented on YARN-2495:


I don't fully understand your question, but I'll admit I've been distracted 
with other JIRAs lately.  Hopefully I'm guessing correctly though... :)

The idea that there are 'valid' labels and that one turns them on/off seems to 
conflict with the goals that I want to see fulfilled with this feature.  Let me 
give a more concrete example, since I know a lot of people are confused as to 
why labels should be dynamic anyway:

Here's node A's configuration:

{code}
$ hadoop version
Hadoop 2.6.0
Subversion http://github.com/apache/hadoop -r 1e6d81a8869ceeb2f0f81f2ee4b89833f2b22cd4
Compiled by aw on 2014-10-10T19:31Z
Compiled with protoc 2.5.0
From source with checksum 201dad5b10939faa6e5841378c8c94
This command was run using /sw/hadoop/hadoop-2.6.0-SNAPSHOT/share/hadoop/common/hadoop-common-2.6.0-SNAPSHOT.jar
$ java -version
java version "1.6.0_32"
OpenJDK Runtime Environment (IcedTea6 1.13.4) (rhel-7.1.13.4.el6_5-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
{code}

Here's node B's configuration:
{code}
$ ~/HADOOP/hadoop-3.0.0-SNAPSHOT/bin/hadoop version
Hadoop 3.0.0
Source code repository https://git-wip-us.apache.org/repos/asf/hadoop.git -r 7b29f99ad23b2a87eac17fdcc7b5b29cd6c9b0c0
Compiled by aw on 2014-10-08T21:12Z
Compiled with protoc 2.5.0
From source with checksum ae703b1a38a35d19f4584495dc31944
This command was run using /Users/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/share/hadoop/common/hadoop-common-3.0.0-SNAPSHOT.jar
$ java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
{code}

(Ignore the fact that they probably can't be in the same cluster. That's not 
really relevant.)

I want to provide a script similar to this one:
{code}
#!/usr/bin/env bash

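# First line of `hadoop version` is "Hadoop <version>"; take the second field.
HADOOPVERSION=$(${HADOOP_PREFIX}/bin/hadoop version 2>/dev/null | /usr/bin/head -1 | /usr/bin/cut -f2 -d' ')
# `java -version` prints to stderr; the version is the quoted string on line 1.
JAVAVERSION=$(${JAVA_HOME}/bin/java -version 2>&1 | /usr/bin/head -1 | /usr/bin/cut -f2 -d\")
# Strip the update suffix, e.g. 1.7.0_67 -> 1.7.0
JAVAMMM=$(echo ${JAVAVERSION} | /usr/bin/cut -f1 -d_)

echo "LABELS=jdk${JAVAVERSION},jdk${JAVAMMM},hadoop_${HADOOPVERSION}"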
{code}

This script, when run, should tell the RM that:

Node A has labels jdk1.6.0_32, jdk1.6.0, and hadoop_2.6.0.
Node B has labels jdk1.7.0_67, jdk1.7.0, and hadoop_3.0.0.

Users should be able to submit a job that specifies that it only runs with 
'hadoop_3.0.0'.  Or that it requires 'jdk1.7.0'.

Making labels either valid or invalid based upon a central configuration 
defeats the purpose of having a living, breathing cluster able to adapt to 
change without operations intervention.  If we are rolling out a new version of 
the JDK, I shouldn't have to tell the system that it's ok to broadcast that JDK 
version first.
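
For illustration, the `LABELS=` line printed by the script above could be turned into a label set on the NM side roughly as follows. `ScriptLabelParser` is a hypothetical name, and the single-line `LABELS=a,b,c` format is taken from the sample script rather than from any agreed interface.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical parser for one line of label-script output. */
final class ScriptLabelParser {
    private static final String PREFIX = "LABELS=";

    private ScriptLabelParser() {}

    /** Parses a line such as "LABELS=a,b,c" into an ordered set of labels. */
    static Set<String> parse(String line) {
        if (line == null || !line.startsWith(PREFIX)) {
            // Anything that does not look like a label line yields no labels.
            return Collections.emptySet();
        }
        Set<String> labels = new LinkedHashSet<>();
        for (String part : line.substring(PREFIX.length()).split(",")) {
            String label = part.trim();
            if (!label.isEmpty()) {
                labels.add(label);
            }
        }
        return labels;
    }
}
```

With node B's output, `ScriptLabelParser.parse("LABELS=jdk1.7.0_67,jdk1.7.0,hadoop_3.0.0")` yields the three labels jdk1.7.0_67, jdk1.7.0 and hadoop_3.0.0, with no central list consulted.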




[jira] [Commented] (YARN-2495) Allow admin specify labels in each NM (Distributed configuration)

2014-10-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169211#comment-14169211
 ] 

Naganarasimha G R commented on YARN-2495:
-

Hi [~wangda] & [~aw],
A few queries:
Some requirements make more sense for centralized configuration than for 
distributed configuration:
* ADD_LABELS (add valid labels which can be assigned to nodes of the 
cluster)
* REMOVE_LABELS (remove from the valid labels of the cluster)

As we are setting the labels for each node through configuration in 
yarn-site.xml or dynamic scripting, my opinion is not to support the above 
cluster-level label operations in distributed configuration.
But if we are required to support this:
* As the valid labels information is only available with the RM, after the 
node sends a heartbeat the RM either retains only the valid labels for that 
node (logging any invalid labels), or we can just ignore all the labels for 
that node if any invalid label exists.
* We need to store the valid cluster node labels in a file, but need not 
store the labels of cluster nodes, as a node's labels can be obtained at every 
registration.



