[ https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106419#comment-17106419 ]
Yiqun Lin commented on HDFS-15340: ---------------------------------- The comment makes sense to me, thanks [~LiJinglun]. And the latest parch almost looks good now. {quote}The reason passing the lastProcedure is having a context between upstream and downstream. For example supposing we have a topology below, A might go to either B1 or B2. With the last procedure C can have different behaviors. Let me know your thoughts. I'm also ok to remove it, in v05 I removed the lastProcedure. {quote} In current fed discp balance scenario, we don't really need to have a context with upstream I think. So I'm +1 to remove lastProcedure. {quote}I think comparing job id, procedureTable and firstProcedure would be enough because they are immutable. The other infos like curProcedure, lastProcedure and jobDone change while the job runs. The initial job object might have different curProcedure, lastProcedure and jobDone with the recovered job object. But they are representing the same job {quote} Yes, I agree that comparing with job id, procedureTable and firstProcedure should enough. Here my point is that we need to verify that the job is correctly be recovered from HDFS journal, including last stage info that job has finished successfully. Then the recovered job can continue to next stage to run. In current ut, following check is not enough, we need to add last procedure stage assert check as well. {code:java} assertNotSame(job, recoverJob); assertEquals(job, recoverJob) {code} In addition, can you add some necessary comment for method BalanceJournalInfoHDFS#saveJob to describe about hdfs journal file layout? This can let us quickly know how job journal files are organized in HDFS. It takes me some time to understand this :D. Can you rename these two method names as well? private Path getNextJob(BalanceJob job) -> getNewStateJobPath private Path getLatestJob(BalanceJob job) -> getLatestStateJobPath > RBF: Implement BalanceProcedureScheduler basic framework > -------------------------------------------------------- > > Key: HDFS-15340 > URL: https://issues.apache.org/jira/browse/HDFS-15340 > Project: Hadoop HDFS > Issue Type: Sub-task > Reporter: Jinglun > Assignee: Jinglun > Priority: Major > Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, > HDFS-15340.003.patch, HDFS-15340.004.patch, HDFS-15340.005.patch, > HDFS-15340.006.patch > > > Patch in HDFS-15294 is too big to review so we split it into 2 patches. This > is the first one. Detail can be found at HDFS-15294. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org