[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129117#comment-14129117 ]

Hitesh Shah commented on TEZ-850:

Comments on patch:
- needs a minor rebase
- MockVertexImpl does not seem to be used
- recoveredState from restoreFromEvent() is not used/asserted against
- new warnings introduced in TestDAGRecovery and TestVertexRecovery

Recovery unit tests
---
Key: TEZ-850
URL: https://issues.apache.org/jira/browse/TEZ-850
Project: Apache Tez
Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Jeff Zhang
Attachments: Tez-850-2.patch, Tez-850-3.patch, Tez-850.patch

Tests for custom edge managers, groups handling, etc.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129566#comment-14129566 ]

Jeff Zhang commented on TEZ-850:

Attached the new patch.
* rebased it
* removed MockVertexImpl, which is not used
* added verification of recoveredState from restoreFromEvent() in TestVertexRecovery
* TestDAGRecovery and TestVertexRecovery show no warnings in the Eclipse auto check or when running mvn test-compile. Let me know if we need to do any other warning checks.

Attachments: Tez-850-2.patch, Tez-850-3.patch, Tez-850-4.patch, Tez-850.patch
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129596#comment-14129596 ]

Hitesh Shah commented on TEZ-850:

There were minor warnings, for example:
- instead of using for (VertexEvent e: events), a for (int i = 0 , i 2) loop was being used
- for restoreFromEvent, the return value was neither used nor verified with an assert
- a couple of places related to checking output committers had warnings about the value possibly being null

Attachments: Tez-850-2.patch, Tez-850-3.patch, Tez-850-4.patch, Tez-850.patch
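The first two warnings above can be illustrated with a minimal, self-contained sketch. It is not taken from the patch: ToyVertex, ToyVertexEvent, VertexState, restoreAll, and demo are hypothetical stand-ins rather than Tez classes, and the toy restoreFromEvent() simply adopts the state carried by the event. The sketch only shows the shape of an enhanced for loop whose body checks the value returned by restoreFromEvent().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Tez classes.
public class RestoreLoopSketch {
    enum VertexState { NEW, INITED, RUNNING }

    // Toy event that carries the state a vertex should reach after it is replayed.
    static final class ToyVertexEvent {
        final VertexState resultingState;

        ToyVertexEvent(VertexState resultingState) {
            this.resultingState = resultingState;
        }
    }

    // Toy vertex whose restoreFromEvent() returns the state recovered so far.
    static final class ToyVertex {
        private VertexState recoveredState = VertexState.NEW;

        VertexState restoreFromEvent(ToyVertexEvent event) {
            recoveredState = event.resultingState;
            return recoveredState;
        }
    }

    // Replays the events with an enhanced for loop and checks every return value,
    // so the result of restoreFromEvent() is never silently dropped.
    static VertexState restoreAll(ToyVertex vertex, List<ToyVertexEvent> events) {
        VertexState last = VertexState.NEW;
        for (ToyVertexEvent event : events) {
            last = vertex.restoreFromEvent(event);
            if (last != event.resultingState) {
                throw new AssertionError("unexpected recovered state: " + last);
            }
        }
        return last;
    }

    // Replays INITED then RUNNING on a fresh toy vertex.
    static VertexState demo() {
        List<ToyVertexEvent> events = Arrays.asList(
            new ToyVertexEvent(VertexState.INITED),
            new ToyVertexEvent(VertexState.RUNNING));
        return restoreAll(new ToyVertex(), events);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints RUNNING
    }
}
```

In a real JUnit test the explicit AssertionError would normally be an assertEquals call; the plain check is used here only to keep the sketch free of dependencies.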
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129641#comment-14129641 ]

Jeff Zhang commented on TEZ-850:

[~hitesh] Updated the patch, removing the //TODO. Maybe it was related to [TEZ-1404|https://issues.apache.org/jira/browse/TEZ-1404], which is resolved. I can't remember clearly. :(

Attachments: Tez-850-2.patch, Tez-850-3.patch, Tez-850-4.patch, Tez-850-5.patch, Tez-850-6.patch, Tez-850.patch
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127356#comment-14127356 ]

Hitesh Shah commented on TEZ-850:

Mostly looks fine. A couple of minor comments:
- for fields changed from private to package-private, please add a VisibleForTesting annotation
- Also, for certain critical events that are logged out of band, we need some additional tests for scenarios where the out-of-band event was logged in the summary file but the main event was not logged in the DAG recovery file. This could be done as a separate jira.

Attachments: Tez-850-2.patch, Tez-850.patch
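As a rough illustration of the first comment, the sketch below marks a field whose visibility was widened for tests. Everything in it is an assumption made for the example: ToyTask, recoveredAttempts, and attemptsAfter are hypothetical names, and the nested VisibleForTesting annotation is a local stand-in declared so the sketch compiles without dependencies. In the real code base the annotation would presumably be Guava's com.google.common.annotations.VisibleForTesting.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical example; ToyTask is not a Tez class.
public class VisibilitySketch {

    // Local stand-in annotation (an assumption for this sketch) so that the
    // example compiles without Guava on the classpath.
    @Retention(RetentionPolicy.CLASS)
    @Target({ElementType.FIELD, ElementType.METHOD, ElementType.TYPE, ElementType.CONSTRUCTOR})
    @interface VisibleForTesting {}

    static final class ToyTask {
        // Was private; widened to package-private so a test in the same
        // package can inspect it. The annotation documents that intent.
        @VisibleForTesting
        int recoveredAttempts = 0;

        void restoreAttempt() {
            recoveredAttempts++;
        }
    }

    // Restores n attempts and reads the widened field directly, as a test would.
    static int attemptsAfter(int n) {
        ToyTask task = new ToyTask();
        for (int i = 0; i < n; i++) {
            task.restoreAttempt();
        }
        return task.recoveredAttempts;
    }

    public static void main(String[] args) {
        System.out.println(attemptsAfter(2)); // prints 2
    }
}
```

The annotation has no runtime effect; it only signals to readers and static analysis tools that the wider visibility exists for the benefit of tests.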
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127876#comment-14127876 ]

Jeff Zhang commented on TEZ-850:

Attached the new patch (added the VisibleForTesting annotation).

Created a jira for unit tests of critical events that are logged out of band: [TEZ-1561|https://issues.apache.org/jira/browse/TEZ-1561]

Attachments: Tez-850-2.patch, Tez-850-3.patch, Tez-850.patch
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102070#comment-14102070 ]

Jeff Zhang commented on TEZ-850:

[~hitesh] Attached the patch.
* It contains recovery unit tests for DAG, Vertex, Task, and TaskAttempt. Functional tests like TestFaultTolerance are not included; I will add them in the next patch.
* Each unit test follows the pattern ( restoreFromHistoryEvent ... -> RecoverTransition ). The unit test in [TEZ-1024|https://issues.apache.org/jira/browse/TEZ-1024] is consolidated into TestTaskRecovery.

Attachments: Tez-850.patch
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095078#comment-14095078 ]

Hitesh Shah commented on TEZ-850:

There may be other cases that require running jobs similar to the model used in TestFaultTolerance. In such cases, we would write special jobs with custom vertex managers that can induce failures at certain points.
[jira] [Commented] (TEZ-850) Recovery unit tests
[ https://issues.apache.org/jira/browse/TEZ-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090416#comment-14090416 ]

Jeff Zhang commented on TEZ-850:

[~hitesh], I have done some work on this. Since this would be a large patch, I want to describe my overall design and get your comments first.

All of the recovery unit tests here cover the recovery process itself; they do not cover logging recovery events or parsing the recovery file. The overall design is to first restore the entity from HistoryEvents and then call the RecoverTransition. One sample test case for recovery of a DAG looks like this:
{code}
restoreFromDAGInitializedEvent -> restoreFromDAGStartedEvent -> restoreFromDAGFinished (SUCCEED) -> RecoverTransition
{code}
In each step, I will verify that the status fields are updated and that the entity is in the correct state.
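The restore-then-recover design described above could be sketched roughly as follows. This is a toy model under stated assumptions, not the test code from the patch: ToyDag, DagState, EventType, check, and replaySuccessfulDag are hypothetical names, and the real Tez state machine and history event classes are considerably richer. The sketch only shows the idea of verifying the recovered state after every restore step and again after the recover transition.

```java
// Hypothetical stand-ins for illustration only; none of these are Tez classes.
public class DagRecoverySketch {
    enum DagState { NEW, INITED, RUNNING, SUCCEEDED }

    enum EventType { DAG_INITIALIZED, DAG_STARTED, DAG_FINISHED }

    static final class ToyDag {
        private DagState recoveredState = DagState.NEW;
        private DagState state = DagState.NEW;

        // Applies one history event and returns the state recovered so far.
        DagState restoreFromEvent(EventType event) {
            switch (event) {
                case DAG_INITIALIZED:
                    recoveredState = DagState.INITED;
                    break;
                case DAG_STARTED:
                    recoveredState = DagState.RUNNING;
                    break;
                case DAG_FINISHED:
                    recoveredState = DagState.SUCCEEDED;
                    break;
                default:
                    throw new IllegalArgumentException("unknown event: " + event);
            }
            return recoveredState;
        }

        // Stand-in for the recover transition: the DAG adopts the restored state.
        DagState recover() {
            state = recoveredState;
            return state;
        }
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Replays initialized -> started -> finished, checking the state after each
    // restore step, then runs the recover transition and returns the final state.
    static DagState replaySuccessfulDag() {
        ToyDag dag = new ToyDag();
        check(dag.restoreFromEvent(EventType.DAG_INITIALIZED) == DagState.INITED,
            "expected INITED after DAG_INITIALIZED");
        check(dag.restoreFromEvent(EventType.DAG_STARTED) == DagState.RUNNING,
            "expected RUNNING after DAG_STARTED");
        check(dag.restoreFromEvent(EventType.DAG_FINISHED) == DagState.SUCCEEDED,
            "expected SUCCEEDED after DAG_FINISHED");
        return dag.recover();
    }

    public static void main(String[] args) {
        System.out.println(replaySuccessfulDag()); // prints SUCCEEDED
    }
}
```

Checking after every step, rather than only at the end, means a failure points at the specific event whose restore logic is wrong.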