[jira] [Work logged] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap

ASF GitHub Bot (Jira) Thu, 19 Jan 2023 22:02:10 -0800


     [ https://issues.apache.org/jira/browse/HIVE-26599?focusedWorklogId=840511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-840511 ]


ASF GitHub Bot logged work on HIVE-26599:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jan/23 06:01
            Start Date: 20/Jan/23 06:01
    Worklog Time Spent: 10m 
      Work Description: pudidic commented on code in PR #3963:
URL: https://github.com/apache/hive/pull/3963#discussion_r1082136015


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java:
##########
@@ -865,6 +865,10 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive
       if (conf.getBoolVar(HiveConf.ConfVars.HIVE_REPL_FAILOVER_START)) {
         work.getMetricCollector().reportFailoverStart(getName(), metricMap, work.getFailoverMetadata());
       } else {
+        int size = tablesForBootstrap.size();
+        if(size > 0) {

Review Comment:
   Please follow the coding convention with a whitespace after if.



##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOptimisedBootstrap.java:
##########
@@ -906,6 +912,82 @@ public void testOverwriteDuringBootstrap() throws Throwable {
         .verifyFailure(new String[]{"tnew_managed"});
   }
 
+  @Test
+  public void testTblMetricRegisterDuringSecondCycleOfOptimizedBootstrap() throws Throwable{

Review Comment:
   Please follow the coding convention by adding a whitespace after Throwable.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 840511)
    Time Spent: 40m  (was: 0.5h)

> Fix NPE encountered in second dump cycle of optimised bootstrap
> ---------------------------------------------------------------
>
>                 Key: HIVE-26599
>                 URL: https://issues.apache.org/jira/browse/HIVE-26599
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Vinit Patni
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> After failover from the Primary to the DR cluster completes and DR takes 
> over, we create a reverse replication policy. The first dump and load cycle 
> of optimised bootstrap completes successfully, but the second dump cycle 
> hits a NullPointerException. This halts the reverse replication and is a 
> major blocker for testing the complete replication cycle. 
> {code:java}
> Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192)
> at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458)
> at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961)
> at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232)
> {code}
> After doing RCA, we found that in the second dump cycle on the DR cluster, 
> when the StageStart method is invoked, the metric corresponding to Tables is 
> not registered. It should be registered, because optimised bootstrap does a 
> selective bootstrap of tables along with the incremental dump. The missing 
> registration causes the NPE later, when the code tries to update the 
> progress of this metric after the bootstrap of a table completes. 
> The fix is to register the Tables metric before updating the progress.
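
The failure mode described above can be illustrated with a minimal sketch. Everything here is hypothetical: StageMetricsSketch, buildMetricMap, and the metric names "EVENTS" and "TABLES" are stand-ins chosen for illustration, not Hive's ReplicationMetricCollector or the actual patch. The sketch only assumes that progress updates look up a metric registered at stage start, so an unregistered metric yields null and a NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a replication metric collector; NOT Hive's actual class.
class StageMetricsSketch {
    // Metric name -> {expected total, current count}.
    private final Map<String, long[]> metrics = new HashMap<>();

    // Assumed shape of the fix: add the "TABLES" metric whenever the
    // incremental dump also has tables to bootstrap.
    static Map<String, Long> buildMetricMap(long eventCount, int tablesForBootstrap) {
        Map<String, Long> metricMap = new HashMap<>();
        metricMap.put("EVENTS", eventCount);
        if (tablesForBootstrap > 0) {
            metricMap.put("TABLES", (long) tablesForBootstrap);
        }
        return metricMap;
    }

    // Registers every metric the stage will later report progress on.
    void reportStageStart(Map<String, Long> metricMap) {
        metrics.clear();
        for (Map.Entry<String, Long> e : metricMap.entrySet()) {
            metrics.put(e.getKey(), new long[] {e.getValue(), 0L});
        }
    }

    // Throws NullPointerException if the metric was never registered,
    // which mirrors the failure in the stack trace above.
    void reportStageProgress(String name, long count) {
        long[] metric = metrics.get(name);
        metric[1] += count;
    }

    long current(String name) {
        return metrics.get(name)[1];
    }

    public static void main(String[] args) {
        StageMetricsSketch collector = new StageMetricsSketch();
        collector.reportStageStart(buildMetricMap(10L, 2));
        collector.reportStageProgress("TABLES", 1);
        System.out.println(collector.current("TABLES")); // prints 1
    }
}
```

If reportStageStart is given a map without "TABLES" (the pre-fix situation in this sketch), the later reportStageProgress("TABLES", 1) call dereferences null. Registering the metric up front avoids that without adding a null check on every progress update.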



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
