[jira] [Commented] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

Yiqun Lin (Jira) Tue, 12 May 2020 09:13:23 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105563#comment-17105563
 ]


Yiqun Lin commented on HDFS-15340:
----------------------------------

Thanks for addressing the comments, [~LiJinglun].

Some places are confused to understand, also you need to take care for some 
typos, :).

Some more review comments:

*BalanceJob.java*
 Typo: {{Each procedure need}} --> {{Each procedure needs}}
 Typo: jos -> job

Can you rewrite this:
{noformat}
Start procedure {}. The last procedure is ... -- > Start procedure {}, last 
procedure is 
{noformat}
How about adding sleep interval time when curProcedure.execute returning false 
that means curProcedure executed failed? It can avoid frequently executing 
failed procedure.
{noformat}
if (curProcedure.execute(lastProcedure)) {
+            lastProcedure = curProcedure;
+            curProcedure = next();
+          }
{noformat}
The balancer job is same, why we always write its journal info to HDFS? Only 
once time is enough I think.
{noformat}
if (!scheduler.writeJournal(this)) {
+            quit = true; // Write journal failed. Simply quit because this job
+                         // has already been added to the recoverQueue.
+            LOG.debug("Write journal failed. Quit and wait for recovery.");
+          }
{noformat}
I see this is also used for testing, can you add @VisibleForTesting annotation 
for this?
{noformat}
+  public boolean removeAfterDone() {
+    return removeAfterDone;
+  }{noformat}
It would be better to add comment for this method, we don't know why we pass 
exception here.
{code:java}
private synchronized void finish(Exception exception)
{code}
*BalanceJournalInfoHDFS.java*
 Found many minor grammar rule issues, can you have a quick fix?
 * "list all jobs from journal"; – > "List all jobs from journal"
 * Need to leave one white space: builder.append(",") -> builder.append(", ");
 * clear journal of job - > Clear journal of job

*BalanceProcedure.java*
{code:java}
+  /**
+   * The main process. This is called by the ProcedureScheduler.
+
+   * Make sure the process quits fast when it's interrupted and the scheduler 
is
+   * shut down.
+   *
+   * @param lastProcedure the last procedure.
+   * @throws RetryException if this procedure needs delay a while then retry.
+   * @return true if the procedure has done and the job will go to the next
+   *         procedure, otherwise false.
+   */
+  public abstract boolean execute(T lastProcedure)
+      throws RetryException, IOException;
{code}
lastProcedure here is only used for testing, I suggest to remove this as an 
input parameter. It seems too confused that we pass lastProcedure but do 
nothing in actual BalanceProcedure class. The major function methods need be 
clear for others to understand, :).

*BalanceProcedureScheduler.java*
 For the elapse time calculation on Hadoop world, we will use 
Time.monotonicNow() not Time.now or System.currentTimeMillis(). Can you update 
for this?
{noformat}
this.time = Time.now() + delayInMilliseconds;
long delay = time - System.currentTimeMillis();
{noformat}
*UnrecoverableProcedure.java*
{code:java}
+  @Override
+  public boolean execute(BalanceProcedure lastProcedure) throws RetryException,
+      IOException {
+    if (handler != null) {
+      return handler.execute(lastProcedure);
+    } else {
+      return true;
+    }
+  }
{code}
We could use mock to throw exception, not depend on BalanceProcedure passed to 
throw exception. So lastProcedure can completely removed in execute method.

> RBF: Balance data across federation namespaces with DistCp and snapshot diff 
> / Step 1: The State Machine(BalanceProcedureScheduler)
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15340
>                 URL: https://issues.apache.org/jira/browse/HDFS-15340
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-15340.001.patch, HDFS-15340.002.patch, 
> HDFS-15340.003.patch, HDFS-15340.004.patch
>
>
> Patch in HDFS-15294 is too big to review so we split it into 2 patches. This 
> is the first one. Detail can be found at HDFS-15294.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-15340) RBF: Balance data across federation namespaces with DistCp and snapshot diff / Step 1: The State Machine(BalanceProcedureScheduler)

Reply via email to