[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
- Containers Running: -19
- Memory Used: -38GB
- Vcores Used: -19

After checking the RM logs, we found that the following events had occurred:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some related log records can be found in the "Example.log-cut" attachment.

After some investigation, we concluded that there is some kind of race 
condition for a container that was scheduled to be killed but completed 
successfully before the kill.
There is also a patch that possibly mitigates the effects of the issue but 
doesn't solve the original problem (see mitigating2.5.1.diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago but wasn't submitted properly. We also don't know whether 
the issue exists in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> - Containers Running: -19
> - Memory Used: -38GB
> - Vcores Used: -19
> After checking the RM logs, we found that the following events had occurred:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some related log records can be found in the "Example.log-cut" attachment.
> After some investigation, we concluded that there is some kind of race 
> condition for a container that was scheduled to be killed but completed 
> successfully before the kill.
> There is also a patch that possibly mitigates the effects of the issue but 
> doesn't solve the original problem (see mitigating2.5.1.diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago but wasn't submitted properly. We also don't know 
> whether the issue exists in other versions.
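
The counter arithmetic in the report can be reproduced with a small toy model. This is only a sketch under stated assumptions, not YARN's scheduler or metrics code: the class and method names are hypothetical, the 2 GB and 1 vcore per container are inferred from the reported -38GB / -19 figures, and it assumes that each of the 19 "invalid" releases decremented the counters a second time. The guarded variant shows one possible way to make a release idempotent; it is not a description of what mitigating2.5.1.diff does.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterCounters:
    """Toy stand-in for the RM UI counters (hypothetical, not a YARN class)."""
    containers_running: int = 0
    memory_used_gb: int = 0
    vcores_used: int = 0
    live: set = field(default_factory=set)  # ids of containers currently assigned

    def assign(self, container_id, memory_gb=2, vcores=1):
        # Per-container size (2 GB, 1 vcore) is an assumption inferred from the report.
        self.live.add(container_id)
        self.containers_running += 1
        self.memory_used_gb += memory_gb
        self.vcores_used += vcores

    def release_unguarded(self, container_id, memory_gb=2, vcores=1):
        # Decrements unconditionally, so releasing the same container twice
        # (for example once on completion and once on kill) is counted twice.
        self.live.discard(container_id)
        self.containers_running -= 1
        self.memory_used_gb -= memory_gb
        self.vcores_used -= vcores

    def release_guarded(self, container_id, memory_gb=2, vcores=1):
        # Idempotent release: only the first release of a live container counts.
        if container_id not in self.live:
            return False
        self.live.remove(container_id)
        self.containers_running -= 1
        self.memory_used_gb -= memory_gb
        self.vcores_used -= vcores
        return True


def simulate(release_method, total=67019, doubled=19):
    """Assign `total` containers, release each once, then release `doubled` again."""
    counters = ClusterCounters()
    release = getattr(counters, release_method)
    for cid in range(total):
        counters.assign(cid)
    for cid in range(total):
        release(cid)
    for cid in range(doubled):  # the duplicate releases
        release(cid)
    return counters
```

Under these assumptions, simulate("release_unguarded") ends with the same negative values as in the report, while simulate("release_guarded") ends with all three counters at zero.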



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.





[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Attachment: mitigating2.5.1.diff

> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.





[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating01.patch).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.





[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Attachment: Example.log-cut

> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
> Attachments: Example.log-cut
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating01.patch).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.





[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating01.patch).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
> Attachments: Example.log-cut
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating01.patch).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.





[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
I checked their resource manager logs.
These events happened.
Assigned container: 67019 times
Released container: 67019 times
Invalid container released: 19 times


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times





[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
I checked their resource manager logs.
These events happened.
Assigned container: 67019 times
Released container: 67019 times
Invalid container released: 19 times

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.5.1
> Reporter: Dmytro Kabakchei
> Priority: Minor
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> I checked their resource manager logs.
> These events happened.
> Assigned container: 67019 times
> Released container: 67019 times
> Invalid container released: 19 times


