[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=401983&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-401983
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 12/Mar/20 06:14
Start Date: 12/Mar/20 06:14
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 401983)
Remaining Estimate: 0h
Time Spent: 10m

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404392
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:40
Start Date: 17/Mar/20 01:40
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393399236
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
 ##
 @@ -102,12 +104,20 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
+List dumpWithClause = Arrays.asList(
 
 Review comment:
   Why is this needed?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404392)
Time Spent: 20m  (was: 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404393
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:42
Start Date: 17/Mar/20 01:42
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393399950
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -397,17 +397,19 @@ public void externalTableWithPartitions() throws 
Throwable {
 primary.run("use " + primaryDbName)
 .run("insert into table t2 partition(country='france') values 
('lyon')")
 .run("alter table t2 set location '" + tmpLocation2 + "'")
-.dump(primaryDbName);
+.dump(primaryDbName, withClause);
 
-replica.load(replicatedDbName, primaryDbName, loadWithClause);
+replica.load(replicatedDbName, primaryDbName, withClause);
 assertTablePartitionLocation(primaryDbName + ".t2", replicatedDbName + 
".t2");
   }
 
   @Test
   public void externalTableIncrementalReplication() throws Throwable {
-WarehouseInstance.Tuple tuple = primary.dumpWithCommand("repl dump " + 
primaryDbName);
+List withClause = externalTableBasePathWithClause();
+String replDumpCommand = "repl dump " + primaryDbName
++ " WITH (" + withClause.get(0) + "," + withClause.get(1) + ")";
+WarehouseInstance.Tuple tuple = primary.dumpWithCommand(replDumpCommand);
 
 Review comment:
   can use withClause directly
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404393)
Time Spent: 0.5h  (was: 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404395
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:44
Start Date: 17/Mar/20 01:44
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393400307
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +438,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
+.verifyResults(new String[] {})
 
 Review comment:
   Why is there no data here?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404395)
Time Spent: 40m  (was: 0.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404396
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:45
Start Date: 17/Mar/20 01:45
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393400475
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -713,9 +732,11 @@ public void testExternalTableDataPath() throws Exception {
 
   @Test
   public void testExternalTablesIncReplicationWithConcurrentDropTable() throws 
Throwable {
-List dumpWithClause = Collections.singletonList(
-"'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'"
-);
+List dumpWithClause = Arrays.asList(
 
 Review comment:
   Can be separated in a common method
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404396)
Time Spent: 50m  (was: 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404397
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:48
Start Date: 17/Mar/20 01:48
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393401308
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Can be added to a separate method
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404397)
Time Spent: 1h  (was: 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404398
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:52
Start Date: 17/Mar/20 01:52
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393402297
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,25 +127,31 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-  ReplChangeManager.getInstance(conf);
-  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-  Long lastReplId;
-  if (!dumpRoot.getFileSystem(conf).exists(dumpRoot)
-  || dumpRoot.getFileSystem(conf).listStatus(dumpRoot).length == 
0) {
-lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  //First Check if external table copy work has been initialized, if so, 
just do that and return.
 
 Review comment:
   Please add more comments here
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404398)
Time Spent: 1h 10m  (was: 1h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404399
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 01:55
Start Date: 17/Mar/20 01:55
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393402843
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -367,13 +379,14 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 // Dump external table locations if required.
 if (TableType.EXTERNAL_TABLE.equals(table.getTableType())
   && shouldDumpExternalTableLocation()) {
-  writer.dataLocationDump(table);
+  extTableLocations.addAll(writer.dataLocationDump(table));
 }
 
 // Dump the table to be bootstrapped if required.
 if (shouldBootstrapDumpTable(table)) {
   HiveWrapper.Tuple tableTuple = new HiveWrapper(hiveDb, 
dbName).table(table);
-  dumpTable(dbName, tableName, validTxnList, dbRoot, dumpRoot, 
bootDumpBeginReplId, hiveDb, tableTuple);
+  replPathMappings.addAll(dumpTable(dbName, tableName, 
validTxnList, dbRoot, dumpRoot, bootDumpBeginReplId,
 
 Review comment:
   If there are too many external tables OOM?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404399)
Time Spent: 1h 20m  (was: 1h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404428
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 03:38
Start Date: 17/Mar/20 03:38
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393427998
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
 ##
 @@ -102,12 +104,20 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
+List dumpWithClause = Arrays.asList(
 
 Review comment:
   Since the external table copy is happening on source, the configs are 
required to be passed at dump.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404428)
Time Spent: 1.5h  (was: 1h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404441
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:19
Start Date: 17/Mar/20 04:19
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393436614
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -367,13 +379,14 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 // Dump external table locations if required.
 if (TableType.EXTERNAL_TABLE.equals(table.getTableType())
   && shouldDumpExternalTableLocation()) {
-  writer.dataLocationDump(table);
+  extTableLocations.addAll(writer.dataLocationDump(table));
 }
 
 // Dump the table to be bootstrapped if required.
 if (shouldBootstrapDumpTable(table)) {
   HiveWrapper.Tuple tableTuple = new HiveWrapper(hiveDb, 
dbName).table(table);
-  dumpTable(dbName, tableName, validTxnList, dbRoot, dumpRoot, 
bootDumpBeginReplId, hiveDb, tableTuple);
+  replPathMappings.addAll(dumpTable(dbName, tableName, 
validTxnList, dbRoot, dumpRoot, bootDumpBeginReplId,
 
 Review comment:
   That's the existing issue with the way current iterators are being used and 
the patch doesn't address that. May be later, we may want to solve that in 
general. May be by using some disk based iterators. Will file a tracking JIRA 
for that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404441)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404439&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404439
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:19
Start Date: 17/Mar/20 04:19
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393436940
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Will see if we can do that. Problem was that the parameters were not fixed 
and were being set on case-to-case basis.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404439)
Time Spent: 2h  (was: 1h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404438
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:19
Start Date: 17/Mar/20 04:19
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393436153
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,25 +127,31 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-  ReplChangeManager.getInstance(conf);
-  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-  Long lastReplId;
-  if (!dumpRoot.getFileSystem(conf).exists(dumpRoot)
-  || dumpRoot.getFileSystem(conf).listStatus(dumpRoot).length == 
0) {
-lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  //First Check if external table copy work has been initialized, if so, 
just do that and return.
 
 Review comment:
   Will do
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404438)
Time Spent: 1h 50m  (was: 1h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404437
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:19
Start Date: 17/Mar/20 04:19
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393432287
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -397,17 +397,19 @@ public void externalTableWithPartitions() throws 
Throwable {
 primary.run("use " + primaryDbName)
 .run("insert into table t2 partition(country='france') values 
('lyon')")
 .run("alter table t2 set location '" + tmpLocation2 + "'")
-.dump(primaryDbName);
+.dump(primaryDbName, withClause);
 
-replica.load(replicatedDbName, primaryDbName, loadWithClause);
+replica.load(replicatedDbName, primaryDbName, withClause);
 assertTablePartitionLocation(primaryDbName + ".t2", replicatedDbName + 
".t2");
   }
 
   @Test
   public void externalTableIncrementalReplication() throws Throwable {
-WarehouseInstance.Tuple tuple = primary.dumpWithCommand("repl dump " + 
primaryDbName);
+List withClause = externalTableBasePathWithClause();
+String replDumpCommand = "repl dump " + primaryDbName
++ " WITH (" + withClause.get(0) + "," + withClause.get(1) + ")";
+WarehouseInstance.Tuple tuple = primary.dumpWithCommand(replDumpCommand);
 
 Review comment:
   Will change
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404437)
Time Spent: 1h 40m  (was: 1.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404440
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:19
Start Date: 17/Mar/20 04:19
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393433210
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +438,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
+.verifyResults(new String[] {})
 
 Review comment:
   Since the modification happened after dump, data will be seen only after 
next dump-load cycle, which is being tested.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404440)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404447
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:39
Start Date: 17/Mar/20 04:39
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r392815132
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,25 +127,31 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-  ReplChangeManager.getInstance(conf);
-  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-  Long lastReplId;
-  if (!dumpRoot.getFileSystem(conf).exists(dumpRoot)
-  || dumpRoot.getFileSystem(conf).listStatus(dumpRoot).length == 
0) {
-lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  //First Check if external table copy work has been initialized, if so, 
just do that and return.
+  if (work.dirCopyIteratorInitialized() || 
work.replPathIteratorInitialized()) {
+intitiateDataCopyTasks();
 
 Review comment:
   why cant external table copy task goes in parallel to other tasks
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404447)
Time Spent: 2h 10m  (was: 2h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404446&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404446
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 04:39
Start Date: 17/Mar/20 04:39
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r392815969
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -386,8 +399,10 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 }
   }
   dumpTableListToDumpLocation(tableList, dumpRoot, dbName, conf);
+  List extTableCopyWorks = 
dirLocationsToCopy(extTableLocations);
 
 Review comment:
   if extTableLocations is empty ..no need to create the task
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404446)
Time Spent: 2h 10m  (was: 2h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404468
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 06:11
Start Date: 17/Mar/20 06:11
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393462313
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -386,8 +399,10 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
 }
   }
   dumpTableListToDumpLocation(tableList, dumpRoot, dbName, conf);
+  List extTableCopyWorks = 
dirLocationsToCopy(extTableLocations);
 
 Review comment:
   Sure, will fix that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404468)
Time Spent: 2h 20m  (was: 2h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404470
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 06:13
Start Date: 17/Mar/20 06:13
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393463008
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,25 +127,31 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-  ReplChangeManager.getInstance(conf);
-  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-  Long lastReplId;
-  if (!dumpRoot.getFileSystem(conf).exists(dumpRoot)
-  || dumpRoot.getFileSystem(conf).listStatus(dumpRoot).length == 
0) {
-lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  //First Check if external table copy work has been initialized, if so, 
just do that and return.
+  if (work.dirCopyIteratorInitialized() || 
work.replPathIteratorInitialized()) {
+intitiateDataCopyTasks();
 
 Review comment:
   We would like to have a control on how many tasks in parallel can run 
including external table copy. That was the existing behavior even in case of 
external table copy.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404470)
Time Spent: 2.5h  (was: 2h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404474
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 06:16
Start Date: 17/Mar/20 06:16
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393463746
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
 ##
 @@ -102,12 +104,20 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
+List dumpWithClause = Arrays.asList(
+"'hive.repl.add.raw.reserved.namespace'='true'",
+"'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + 
"'='"
++ replica.externalTableWarehouseRoot + "'",
+"'distcp.options.skipcrccheck'=''",
+"'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname + 
"'='false'",
+"'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
++ UserGroupInformation.getCurrentUser().getUserName() 
+"'");
 
 Review comment:
   Please add a test with scheduler as well
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404474)
Time Spent: 2h 40m  (was: 2.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404527
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 08:17
Start Date: 17/Mar/20 08:17
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393506609
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -713,9 +732,11 @@ public void testExternalTableDataPath() throws Exception {
 
   @Test
   public void testExternalTablesIncReplicationWithConcurrentDropTable() throws 
Throwable {
-List dumpWithClause = Collections.singletonList(
-"'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'"
-);
+List dumpWithClause = Arrays.asList(
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404527)
Time Spent: 3h  (was: 2h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404526&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404526
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 08:17
Start Date: 17/Mar/20 08:17
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393506405
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
 ##
 @@ -102,12 +104,20 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
+List dumpWithClause = Arrays.asList(
+"'hive.repl.add.raw.reserved.namespace'='true'",
+"'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + 
"'='"
++ replica.externalTableWarehouseRoot + "'",
+"'distcp.options.skipcrccheck'=''",
+"'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname + 
"'='false'",
+"'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
++ UserGroupInformation.getCurrentUser().getUserName() 
+"'");
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404526)
Time Spent: 2h 50m  (was: 2h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404528
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 08:18
Start Date: 17/Mar/20 08:18
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393506687
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404528)
Time Spent: 3h 10m  (was: 3h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404747&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404747
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:29
Start Date: 17/Mar/20 14:29
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393660921
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -713,9 +732,11 @@ public void testExternalTableDataPath() throws Exception {
 
   @Test
   public void testExternalTablesIncReplicationWithConcurrentDropTable() throws 
Throwable {
-List dumpWithClause = Collections.singletonList(
-"'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'"
-);
+List dumpWithClause = Arrays.asList(
 
 Review comment:
   The new method is still not used here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404747)
Time Spent: 3.5h  (was: 3h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404749
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:29
Start Date: 17/Mar/20 14:29
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393662263
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Its still using its own configs. Is this done?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404749)
Time Spent: 3h 40m  (was: 3.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404748
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:29
Start Date: 17/Mar/20 14:29
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393660435
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +436,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
 
 Review comment:
   How was it getting loaded here?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404748)
Time Spent: 3h 40m  (was: 3.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404746&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404746
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:29
Start Date: 17/Mar/20 14:29
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393658865
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
 ##
 @@ -102,12 +104,20 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
+List dumpWithClause = Arrays.asList(
+"'hive.repl.add.raw.reserved.namespace'='true'",
+"'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + 
"'='"
++ replica.externalTableWarehouseRoot + "'",
+"'distcp.options.skipcrccheck'=''",
 
 Review comment:
   Why do we need these extra configs
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404746)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404745&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404745
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:29
Start Date: 17/Mar/20 14:29
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393660304
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +438,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
+.verifyResults(new String[] {})
 
 Review comment:
   How was this getting loaded in the older scenario
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404745)
Time Spent: 3h 20m  (was: 3h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404753
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:32
Start Date: 17/Mar/20 14:32
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393723340
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +438,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
+.verifyResults(new String[] {})
 
 Review comment:
   The copy was happening in repl load, change was prior to that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404753)
Time Spent: 3h 50m  (was: 3h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404754
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:33
Start Date: 17/Mar/20 14:33
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393723769
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Yes, I am not planning to refactor all the places.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404754)
Time Spent: 4h  (was: 3h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404755
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 14:34
Start Date: 17/Mar/20 14:34
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393724555
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
 ##
 @@ -102,12 +104,20 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
+List dumpWithClause = Arrays.asList(
+"'hive.repl.add.raw.reserved.namespace'='true'",
+"'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + 
"'='"
++ replica.externalTableWarehouseRoot + "'",
+"'distcp.options.skipcrccheck'=''",
 
 Review comment:
   That was part of existing test, I have used the same set of config in dump 
time
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404755)
Time Spent: 4h 10m  (was: 4h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404790
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 15:26
Start Date: 17/Mar/20 15:26
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393762700
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +438,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
+.verifyResults(new String[] {})
 
 Review comment:
   Because copy is happening during load.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404790)
Time Spent: 4h 20m  (was: 4h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404793
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 15:28
Start Date: 17/Mar/20 15:28
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393723769
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Yes, I was not planning to refactor all the places.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404793)
Time Spent: 4.5h  (was: 4h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404796
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 15:31
Start Date: 17/Mar/20 15:31
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393723769
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTableLevelReplicationScenarios.java
 ##
 @@ -918,7 +922,9 @@ public void 
testRenameTableScenariosWithReplaceExternalTable() throws Throwable
 String newPolicy = primaryDbName + ".'(in[0-9]+)|(out1500)|(in2)'";
 dumpWithClause = Arrays.asList(
 "'" + HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES.varname + 
"'='true'",
-"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'"
+"'" + HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES.varname + 
"'='false'",
 
 Review comment:
   Missed it here, will fix
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404796)
Time Spent: 4h 40m  (was: 4.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=404945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-404945
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 17/Mar/20 20:29
Start Date: 17/Mar/20 20:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r393762700
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -436,16 +438,29 @@ public void externalTableIncrementalReplication() throws 
Throwable {
 }
 
 List loadWithClause = externalTableBasePathWithClause();
-replica.load(replicatedDbName, primaryDbName, loadWithClause)
+replica.load(replicatedDbName, primaryDbName, withClause)
 .run("use " + replicatedDbName)
 .run("show tables like 't1'")
 .verifyResult("t1")
 .run("show partitions t1")
 .verifyResults(new String[] { "country=india", "country=us" })
 .run("select place from t1 order by place")
-.verifyResults(new String[] { "bangalore", "mumbai", "pune" })
+.verifyResults(new String[] {})
 
 Review comment:
   Because copy was happening during load.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 404945)
Time Spent: 4h 50m  (was: 4h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405378
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:13
Start Date: 18/Mar/20 12:13
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394299142
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -78,22 +81,26 @@
 import java.io.IOException;
 import java.io.Serializable;
 import java.nio.charset.StandardCharsets;
+import java.util.Iterator;
 import java.util.Set;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.Base64;
-import java.util.ArrayList;
+import java.util.LinkedList;
 import java.util.UUID;
+import java.util.ArrayList;
 import java.util.concurrent.TimeUnit;
 import static org.apache.hadoop.hive.ql.exec.repl.ReplExternalTables.Writer;
 
 public class ReplDumpTask extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
   private static final String dumpSchema = 
"dump_dir,last_repl_id#string,string";
   private static final String FUNCTION_METADATA_FILE_NAME = 
EximUtil.METADATA_NAME;
   private static final long SLEEP_TIME = 6;
   Set tablesForBootstrap = new HashSet<>();
+  private Path dumpAckFile;
 
 Review comment:
   This needn't be a member variable
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405378)
Time Spent: 5h  (was: 4h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405380
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:13
Start Date: 18/Mar/20 12:13
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394297682
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,17 +690,48 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
+}
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+ReplOperationCompleteAckWork replDumpCompleteAckWork = new 
ReplOperationCompleteAckWork(dumpAckFile);
+Task dumpCompleteAckWorkTask = 
TaskFactory.get(replDumpCompleteAckWork, conf);
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+if (!childTasks.isEmpty()) {
+  boolean ackTaskAdded = false;
+  if (taskTracker.canAddMoreTasks()) {
+childTasks.add(dumpCompleteAckWorkTask);
+ackTaskAdded = true;
+  }
+  if (hasMoreCopyWork() || !ackTaskAdded) {
+DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
 
 Review comment:
   Add the ack only after no more work is pending
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405380)
Time Spent: 5h  (was: 4h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405381
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:13
Start Date: 18/Mar/20 12:13
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394291913
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestScheduledReplicationScenarios.java
 ##
 @@ -151,6 +151,70 @@ public void testAcidTablesReplLoadBootstrapIncr() throws 
Throwable {
   .run("drop table t1");
 
 
+} finally {
+  primary.run("drop scheduled query s1");
+  replica.run("drop scheduled query s2");
+}
+  }
+
+  @Test
+  public void testExternalTablesReplLoadBootstrapIncr() throws Throwable {
+// Bootstrap
+primary.run("use " + primaryDbName)
+.run("create external table t1 (id int)")
+.run("insert into t1 values(1)")
+.run("insert into t1 values(2)");
+try (ScheduledQueryExecutionService schqS =
+ 
ScheduledQueryExecutionService.startScheduledQueryExecutorService(primary.hiveConf))
 {
+  int next = 0;
+  ReplDumpWork.injectNextDumpDirForTest(String.valueOf(next));
+  primary.run("create scheduled query s1 every 10 minutes as repl dump " + 
primaryDbName);
+  primary.run("alter scheduled query s1 execute");
+  Thread.sleep(2);
 
 Review comment:
   You might need to increase this time. Dump is not getting completed
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405381)
Time Spent: 5h  (was: 4h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405379
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:13
Start Date: 18/Mar/20 12:13
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394293829
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,33 +129,38 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
-  Path previousHiveDumpPath =
-  previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
-  //If no previous dump is present or previous dump was loaded, proceed
-  if (shouldDump(previousHiveDumpPath)) {
-Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-// Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-ReplChangeManager.getInstance(conf);
-Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-Long lastReplId;
-if (previousHiveDumpPath == null) {
-  lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  if (work.dirCopyIteratorInitialized() || 
work.replPathIteratorInitialized()) {
+intitiateDataCopyTasks();
+  } else {
+Hive hiveDb = getHive();
+Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
+
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
+.getBytes(StandardCharsets.UTF_8.name(;
+Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
+Path previousHiveDumpPath =
+previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
+//If no previous dump is present or previous dump was loaded, proceed
+if (shouldDump(previousHiveDumpPath)) {
+  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
+  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
+  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
+  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
+  ReplChangeManager.getInstance(conf);
+  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
+  Long lastReplId;
+  if (previousHiveDumpPath == null) {
+lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  } else {
+
work.setEventFrom(getEventFromPreviousDumpMetadata(previousHiveDumpPath));
+lastReplId = incrementalDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  }
+  
prepareReturnValues(Arrays.asList(currentDumpPath.toUri().toString(), 
String.valueOf(lastReplId)));
+  deleteAllPreviousDumpMeta(dumpRoot, currentDumpPath);
 
 Review comment:
   This should be done at the end
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405379)
Time Spent: 5h  (was: 4h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h
>  Remaining Estimate: 0h

[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405384
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:25
Start Date: 18/Mar/20 12:25
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394307271
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,17 +690,48 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
+}
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+ReplOperationCompleteAckWork replDumpCompleteAckWork = new 
ReplOperationCompleteAckWork(dumpAckFile);
+Task dumpCompleteAckWorkTask = 
TaskFactory.get(replDumpCompleteAckWork, conf);
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+if (!childTasks.isEmpty()) {
+  boolean ackTaskAdded = false;
+  if (taskTracker.canAddMoreTasks()) {
+childTasks.add(dumpCompleteAckWorkTask);
+ackTaskAdded = true;
+  }
+  if (hasMoreCopyWork() || !ackTaskAdded) {
+DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
 
 Review comment:
   Ack is being added at the end in these cases as well
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405384)
Time Spent: 5h 10m  (was: 5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405385&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405385
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:26
Start Date: 18/Mar/20 12:26
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394307963
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -78,22 +81,26 @@
 import java.io.IOException;
 import java.io.Serializable;
 import java.nio.charset.StandardCharsets;
+import java.util.Iterator;
 import java.util.Set;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.Base64;
-import java.util.ArrayList;
+import java.util.LinkedList;
 import java.util.UUID;
+import java.util.ArrayList;
 import java.util.concurrent.TimeUnit;
 import static org.apache.hadoop.hive.ql.exec.repl.ReplExternalTables.Writer;
 
 public class ReplDumpTask extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
   private static final String dumpSchema = 
"dump_dir,last_repl_id#string,string";
   private static final String FUNCTION_METADATA_FILE_NAME = 
EximUtil.METADATA_NAME;
   private static final long SLEEP_TIME = 6;
   Set tablesForBootstrap = new HashSet<>();
+  private Path dumpAckFile;
 
 Review comment:
   We don't store the dumpath anywhere. Across the multiple call this info will 
be lost. I guess, ReplDumpWork is a better place for this info.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405385)
Time Spent: 5h 20m  (was: 5h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405386
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 12:27
Start Date: 18/Mar/20 12:27
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394308202
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestScheduledReplicationScenarios.java
 ##
 @@ -151,6 +151,70 @@ public void testAcidTablesReplLoadBootstrapIncr() throws 
Throwable {
   .run("drop table t1");
 
 
+} finally {
+  primary.run("drop scheduled query s1");
+  replica.run("drop scheduled query s2");
+}
+  }
+
+  @Test
+  public void testExternalTablesReplLoadBootstrapIncr() throws Throwable {
+// Bootstrap
+primary.run("use " + primaryDbName)
+.run("create external table t1 (id int)")
+.run("insert into t1 values(1)")
+.run("insert into t1 values(2)");
+try (ScheduledQueryExecutionService schqS =
+ 
ScheduledQueryExecutionService.startScheduledQueryExecutorService(primary.hiveConf))
 {
+  int next = 0;
+  ReplDumpWork.injectNextDumpDirForTest(String.valueOf(next));
+  primary.run("create scheduled query s1 every 10 minutes as repl dump " + 
primaryDbName);
+  primary.run("alter scheduled query s1 execute");
+  Thread.sleep(2);
 
 Review comment:
   Ok
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405386)
Time Spent: 5.5h  (was: 5h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.2.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405536
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 15:59
Start Date: 18/Mar/20 15:59
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394459376
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,20 +193,29 @@ private Long getEventFromPreviousDumpMetadata(Path 
previousDumpPath) throws Sema
   }
 
   private Path getPreviousDumpMetadataPath(Path dumpRoot) throws IOException {
+FileStatus latestUpdatedStatus = null;
 FileSystem fs = dumpRoot.getFileSystem(conf);
 if (fs.exists(dumpRoot)) {
   FileStatus[] statuses = fs.listStatus(dumpRoot);
   if (statuses.length > 0)  {
 
 Review comment:
   We don't need this check
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405536)
Time Spent: 5h 40m  (was: 5.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405541
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 16:06
Start Date: 18/Mar/20 16:06
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394464144
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,20 +193,29 @@ private Long getEventFromPreviousDumpMetadata(Path 
previousDumpPath) throws Sema
   }
 
   private Path getPreviousDumpMetadataPath(Path dumpRoot) throws IOException {
+FileStatus latestUpdatedStatus = null;
 FileSystem fs = dumpRoot.getFileSystem(conf);
 if (fs.exists(dumpRoot)) {
   FileStatus[] statuses = fs.listStatus(dumpRoot);
   if (statuses.length > 0)  {
 
 Review comment:
   Is fs.listStatus() guaranteed to be not null even when the the path does not 
have any folder/files?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405541)
Time Spent: 5h 50m  (was: 5h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=405578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-405578
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 18/Mar/20 16:47
Start Date: 18/Mar/20 16:47
Worklog Time Spent: 10m 
  Work Description: aasha commented on pull request #951: HIVE-22997 : Copy 
external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394492450
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,20 +193,29 @@ private Long getEventFromPreviousDumpMetadata(Path 
previousDumpPath) throws Sema
   }
 
   private Path getPreviousDumpMetadataPath(Path dumpRoot) throws IOException {
+FileStatus latestUpdatedStatus = null;
 FileSystem fs = dumpRoot.getFileSystem(conf);
 if (fs.exists(dumpRoot)) {
   FileStatus[] statuses = fs.listStatus(dumpRoot);
   if (statuses.length > 0)  {
 
 Review comment:
   It will return empty array
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 405578)
Time Spent: 6h  (was: 5h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=406018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-406018
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 19/Mar/20 09:17
Start Date: 19/Mar/20 09:17
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394885186
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -193,20 +193,29 @@ private Long getEventFromPreviousDumpMetadata(Path 
previousDumpPath) throws Sema
   }
 
   private Path getPreviousDumpMetadataPath(Path dumpRoot) throws IOException {
+FileStatus latestUpdatedStatus = null;
 FileSystem fs = dumpRoot.getFileSystem(conf);
 if (fs.exists(dumpRoot)) {
   FileStatus[] statuses = fs.listStatus(dumpRoot);
   if (statuses.length > 0)  {
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 406018)
Time Spent: 6h 10m  (was: 6h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=406019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-406019
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 19/Mar/20 09:17
Start Date: 19/Mar/20 09:17
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r394307271
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,17 +690,48 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
+}
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+ReplOperationCompleteAckWork replDumpCompleteAckWork = new 
ReplOperationCompleteAckWork(dumpAckFile);
+Task dumpCompleteAckWorkTask = 
TaskFactory.get(replDumpCompleteAckWork, conf);
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+if (!childTasks.isEmpty()) {
+  boolean ackTaskAdded = false;
+  if (taskTracker.canAddMoreTasks()) {
+childTasks.add(dumpCompleteAckWorkTask);
+ackTaskAdded = true;
+  }
+  if (hasMoreCopyWork() || !ackTaskAdded) {
+DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
 
 Review comment:
   Ack is being added at the end in these tasks as well
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 406019)
Time Spent: 6h 20m  (was: 6h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407009
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395577292
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplOperationCompleteAckWork.java
 ##
 @@ -18,27 +18,28 @@
 
 package org.apache.hadoop.hive.ql.exec.repl;
 
+import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
 
 import java.io.Serializable;
 
 /**
- * ReplLoadCompleteAckWork.
- * FS based Acknowledgement for repl load complete
+ * ReplOperationCompleteAckWork.
+ * FS based acknowledgement on repl dump and repl load completion.
  *
  */
-@Explain(displayName = "Repl Load Complete Ack", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
-public class ReplLoadCompleteAckWork implements Serializable {
+@Explain(displayName = "Repl Operation Complete Ack", explainLevels = { 
Level.USER, Level.DEFAULT, Level.EXTENDED })
 
 Review comment:
   Just have the class as AckWork.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407009)
Time Spent: 7h 10m  (was: 7h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407007
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395575495
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
+}
+if (!childTasks.isEmpty()) {
 
 Review comment:
   better to not do the negate check and inverse the if else block by removing 
the !
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407007)
Time Spent: 7h  (was: 6h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407012
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395584857
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
 
 Review comment:
   why is this static method here, there is nothing accessed from there why not 
just in the current class. is some encapsulation broken here ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407012)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407004
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395570139
 
 

 ##
 File path: 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java
 ##
 @@ -30,7 +30,7 @@
   REPL_TXN(15),
   REPL_INCREMENTAL_LOAD(16),
   SCHEDULED_QUERY_MAINT(17),
-  REPL_LOAD_COMPLETE_ACK(18);
+  REPL_OPERATION_COMPLETE_ACK(18);
 
   private final int value;
 
 
 Review comment:
   unused imports like HashMap, Util
   remove the private qualifier from constructor.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407004)
Time Spent: 6.5h  (was: 6h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407013
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395582361
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,33 +128,38 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
-  Path previousHiveDumpPath =
-  previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
-  //If no previous dump is present or previous dump was loaded, proceed
-  if (shouldDump(previousHiveDumpPath)) {
-Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-// Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-ReplChangeManager.getInstance(conf);
-Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-Long lastReplId;
-if (previousHiveDumpPath == null) {
-  lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  if (work.dirCopyIteratorInitialized() || 
work.replPathIteratorInitialized()) {
+intitiateDataCopyTasks();
+  } else {
+Hive hiveDb = getHive();
+Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
+
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
+.getBytes(StandardCharsets.UTF_8.name(;
+Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
+Path previousHiveDumpPath =
+previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
+//If no previous dump is present or previous dump was loaded, proceed
+if (shouldDump(previousHiveDumpPath)) {
+  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
+  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
+  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
+  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
+  ReplChangeManager.getInstance(conf);
+  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
+  Long lastReplId;
+  if (previousHiveDumpPath == null) {
+lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  } else {
+
work.setEventFrom(getEventFromPreviousDumpMetadata(previousHiveDumpPath));
+lastReplId = incrementalDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  }
+  
work.setResultValues(Arrays.asList(currentDumpPath.toUri().toString(), 
String.valueOf(lastReplId)));
+  deleteAllPreviousDumpMeta(dumpRoot, currentDumpPath);
+  work.setDumpAckFile(new Path(hiveDumpRoot, 
ReplUtils.DUMP_ACKNOWLEDGEMENT));
+  intitiateDataCopyTasks();
 } else {
-  
work.setEventFrom(getEventFromPreviousDumpMetadata(previousHiveDumpPath));
-  lastReplId = incrementalDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  LOG.warn("Previous Dump is not yet loaded");
 }
-prepareReturnValues(Arrays.asList(currentDumpPath.toUri().toString(), 
String.valueOf(lastReplId)));
-writeDumpCompleteAck(hiveDumpRoot);
-deleteAllPreviousDumpMeta(dumpRoot, currentDumpPath);
-  } else {
-LOG.warn("Previous Dump is not yet loaded");
 
 Review comment:
   this is not warn its info 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407013)

> Copy external table to target during Repl Dump operation
> 
>
> Key:

[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407008
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395570710
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -78,18 +81,21 @@
 import java.io.IOException;
 import java.io.Serializable;
 import java.nio.charset.StandardCharsets;
+import java.util.Iterator;
 import java.util.Set;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.Base64;
-import java.util.ArrayList;
+import java.util.LinkedList;
 import java.util.UUID;
+import java.util.ArrayList;
 import java.util.concurrent.TimeUnit;
 import static org.apache.hadoop.hive.ql.exec.repl.ReplExternalTables.Writer;
 
 public class ReplDumpTask extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
 
 Review comment:
   tablesForBootstrap can be private 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407008)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407006
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395573044
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,33 +128,38 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
-  Path previousHiveDumpPath =
-  previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
-  //If no previous dump is present or previous dump was loaded, proceed
-  if (shouldDump(previousHiveDumpPath)) {
-Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-// Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-ReplChangeManager.getInstance(conf);
-Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-Long lastReplId;
-if (previousHiveDumpPath == null) {
-  lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  if (work.dirCopyIteratorInitialized() || 
work.replPathIteratorInitialized()) {
+intitiateDataCopyTasks();
 
 Review comment:
   should be initiateDataCopyTasks
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407006)
Time Spent: 6h 50m  (was: 6h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407015
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395586086
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
+}
+if (!childTasks.isEmpty()) {
+  DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+} else {
+  prepareReturnValues(work.getResultValues());
+  childTasks.add(TaskFactory.get(new 
ReplOperationCompleteAckWork(work.getDumpAckFile()), conf));
 }
 
 Review comment:
   why is ack work only if the child task is empty, there is possibility of 
adding the ack work at the end of all the dag, if the dag can contain more 
nodes. may be need to move it out of else clause and do it based on 
tracker.canAddMoreTasks or something?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407015)
Time Spent: 7h 40m  (was: 7.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407011
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395581381
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,33 +128,38 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
-  Path previousHiveDumpPath =
-  previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
-  //If no previous dump is present or previous dump was loaded, proceed
 
 Review comment:
   why not move the above statement in the function before  this 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407011)
Time Spent: 7h 20m  (was: 7h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407014
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395584502
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -404,7 +417,8 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
   }
 
   Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true);
-
+  List extTableLocations = new LinkedList<>();
+  List replPathMappings = new ArrayList<>();
 
 Review comment:
   why is this repl path mapping ? isnt this managed table data only ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407014)
Time Spent: 7.5h  (was: 7h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407010
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395579429
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
+}
 
 Review comment:
   why not just have just two statements like, with some return values added to 
child tasks.
   
   work.externalTableCopyTasks(taskTracker, conf);
   work.managedTableCopyTasks()
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407010)
Time Spent: 7h 20m  (was: 7h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407005
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 11:50
Start Date: 20/Mar/20 11:50
Worklog Time Spent: 10m 
  Work Description: anishek commented on pull request #951: HIVE-22997 : 
Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395573595
 
 

 ##
 File path: 
ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestReplDumpTask.java
 ##
 @@ -127,13 +128,15 @@ public void 
removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails() throw
   private int tableDumpCount = 0;
 
   @Override
-  void dumpTable(String dbName, String tblName, String validTxnList, Path 
dbRoot, Path replDataDir,
- long lastReplId, Hive hiveDb, HiveWrapper.Tuple 
tuple)
+  List dumpTable(String dbName, String tblName, 
String validTxnList, Path dbRoot,
+   Path replDataDir, long 
lastReplId, Hive hiveDb,
+   HiveWrapper.Tuple tuple)
   throws Exception {
 tableDumpCount++;
 if (tableDumpCount > 1) {
   throw new TestException();
 }
+return Collections.EMPTY_LIST;
 
 Review comment:
   better to use Collections.emptyList() here and other places, the generic 
cast would  be provided by this.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407005)
Time Spent: 6h 40m  (was: 6.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407024
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 12:27
Start Date: 20/Mar/20 12:27
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395603150
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
+}
+if (!childTasks.isEmpty()) {
+  DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+} else {
+  prepareReturnValues(work.getResultValues());
+  childTasks.add(TaskFactory.get(new 
ReplOperationCompleteAckWork(work.getDumpAckFile()), conf));
 }
 
 Review comment:
   Yes, in my previous patch, I was adding the ack work as child task whenever 
it was possible by making use of tracker.canAddMoreTasks . Then the requirement 
came to run this prepareReturnValues(); as the last operation just before ack 
task.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407024)
Time Spent: 7h 50m  (was: 7h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407025&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407025
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 12:30
Start Date: 20/Mar/20 12:30
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395603150
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
+}
+if (!childTasks.isEmpty()) {
+  DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+} else {
+  prepareReturnValues(work.getResultValues());
+  childTasks.add(TaskFactory.get(new 
ReplOperationCompleteAckWork(work.getDumpAckFile()), conf));
 }
 
 Review comment:
   Yes, in my previous patch, I was adding the ack work as child task whenever 
it was possible by making use of tracker.canAddMoreTasks . Then the requirement 
came to run this prepareReturnValues(); as the last operation just before ack 
task.
   Until prepareReturnValues() can be run as separate task, we can't add it as 
a future operation. There is no generic task we have for ad-hoc operation like 
this and I thought creating a new one would be overkill. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407025)
Time Spent: 8h  (was: 7h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407026
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 20/Mar/20 12:34
Start Date: 20/Mar/20 12:34
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395606610
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
 
 Review comment:
   ReplPathMapping.tasks() is sort of util method. Ideally there should be a 
builder class like we have for dirCopyTasks: ExternalTableCopyTaskBuilder.
   In this case full fledged builder wasn't required. I can add a new Builder 
class for the same.
   May be, we can shift it to ReplCopyTask as a static method as ultimately it 
is giving back ReplCopyTasks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407026)
Time Spent: 8h 10m  (was: 8h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.2.patch, HIVE-22997.4.patch, 
> HIVE-22997.5.patch, HIVE-22997.6.patch, HIVE-22997.7.patch, 
> HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407354&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407354
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395952745
 
 

 ##
 File path: 
ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestReplDumpTask.java
 ##
 @@ -127,13 +128,15 @@ public void 
removeDBPropertyToPreventRenameWhenBootstrapDumpOfTableFails() throw
   private int tableDumpCount = 0;
 
   @Override
-  void dumpTable(String dbName, String tblName, String validTxnList, Path 
dbRoot, Path replDataDir,
- long lastReplId, Hive hiveDb, HiveWrapper.Tuple 
tuple)
+  List dumpTable(String dbName, String tblName, 
String validTxnList, Path dbRoot,
+   Path replDataDir, long 
lastReplId, Hive hiveDb,
+   HiveWrapper.Tuple tuple)
   throws Exception {
 tableDumpCount++;
 if (tableDumpCount > 1) {
   throw new TestException();
 }
+return Collections.EMPTY_LIST;
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407354)
Time Spent: 8h 20m  (was: 8h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407355&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407355
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395952861
 
 

 ##
 File path: 
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java
 ##
 @@ -30,7 +30,7 @@
   REPL_TXN(15),
   REPL_INCREMENTAL_LOAD(16),
   SCHEDULED_QUERY_MAINT(17),
-  REPL_LOAD_COMPLETE_ACK(18);
+  REPL_OPERATION_COMPLETE_ACK(18);
 
   private final int value;
 
 
 Review comment:
   Removed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407355)
Time Spent: 8.5h  (was: 8h 20m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407356
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395953009
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -78,18 +81,21 @@
 import java.io.IOException;
 import java.io.Serializable;
 import java.nio.charset.StandardCharsets;
+import java.util.Iterator;
 import java.util.Set;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.Base64;
-import java.util.ArrayList;
+import java.util.LinkedList;
 import java.util.UUID;
+import java.util.ArrayList;
 import java.util.concurrent.TimeUnit;
 import static org.apache.hadoop.hive.ql.exec.repl.ReplExternalTables.Writer;
 
 public class ReplDumpTask extends Task implements Serializable {
+  private static final long serialVersionUID = 1L;
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407356)
Time Spent: 8h 40m  (was: 8.5h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407357&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407357
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395952774
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplOperationCompleteAckWork.java
 ##
 @@ -18,27 +18,28 @@
 
 package org.apache.hadoop.hive.ql.exec.repl;
 
+import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hive.ql.plan.Explain;
 import org.apache.hadoop.hive.ql.plan.Explain.Level;
 
 import java.io.Serializable;
 
 /**
- * ReplLoadCompleteAckWork.
- * FS based Acknowledgement for repl load complete
+ * ReplOperationCompleteAckWork.
+ * FS based acknowledgement on repl dump and repl load completion.
  *
  */
-@Explain(displayName = "Repl Load Complete Ack", explainLevels = { Level.USER, 
Level.DEFAULT, Level.EXTENDED })
-public class ReplLoadCompleteAckWork implements Serializable {
+@Explain(displayName = "Repl Operation Complete Ack", explainLevels = { 
Level.USER, Level.DEFAULT, Level.EXTENDED })
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407357)
Time Spent: 8h 50m  (was: 8h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407359
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395952920
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,33 +128,38 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
-  Path previousHiveDumpPath =
-  previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
-  //If no previous dump is present or previous dump was loaded, proceed
-  if (shouldDump(previousHiveDumpPath)) {
-Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
-Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
-DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
-// Initialize ReplChangeManager instance since we will require it to 
encode file URI.
-ReplChangeManager.getInstance(conf);
-Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
-Long lastReplId;
-if (previousHiveDumpPath == null) {
-  lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  if (work.dirCopyIteratorInitialized() || 
work.replPathIteratorInitialized()) {
+intitiateDataCopyTasks();
+  } else {
+Hive hiveDb = getHive();
+Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
+
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
+.getBytes(StandardCharsets.UTF_8.name(;
+Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
+Path previousHiveDumpPath =
+previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
+//If no previous dump is present or previous dump was loaded, proceed
+if (shouldDump(previousHiveDumpPath)) {
+  Path currentDumpPath = new Path(dumpRoot, getNextDumpDir());
+  Path hiveDumpRoot = new Path(currentDumpPath, 
ReplUtils.REPL_HIVE_BASE_DIR);
+  DumpMetaData dmd = new DumpMetaData(hiveDumpRoot, conf);
+  // Initialize ReplChangeManager instance since we will require it to 
encode file URI.
+  ReplChangeManager.getInstance(conf);
+  Path cmRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLCMDIR));
+  Long lastReplId;
+  if (previousHiveDumpPath == null) {
+lastReplId = bootStrapDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  } else {
+
work.setEventFrom(getEventFromPreviousDumpMetadata(previousHiveDumpPath));
+lastReplId = incrementalDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  }
+  
work.setResultValues(Arrays.asList(currentDumpPath.toUri().toString(), 
String.valueOf(lastReplId)));
+  deleteAllPreviousDumpMeta(dumpRoot, currentDumpPath);
+  work.setDumpAckFile(new Path(hiveDumpRoot, 
ReplUtils.DUMP_ACKNOWLEDGEMENT));
+  intitiateDataCopyTasks();
 } else {
-  
work.setEventFrom(getEventFromPreviousDumpMetadata(previousHiveDumpPath));
-  lastReplId = incrementalDump(hiveDumpRoot, dmd, cmRoot, hiveDb);
+  LOG.warn("Previous Dump is not yet loaded");
 }
-prepareReturnValues(Arrays.asList(currentDumpPath.toUri().toString(), 
String.valueOf(lastReplId)));
-writeDumpCompleteAck(hiveDumpRoot);
-deleteAllPreviousDumpMeta(dumpRoot, currentDumpPath);
-  } else {
-LOG.warn("Previous Dump is not yet loaded");
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407359)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> 

[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407360
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395953040
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -662,15 +696,37 @@ void dumpTable(String dbName, String tblName, String 
validTxnList, Path dbRoot,
 replLogger.tableLog(tblName, tableSpec.tableHandle.getTableType());
 if (tableSpec.tableHandle.getTableType().equals(TableType.EXTERNAL_TABLE)
 || Utils.shouldDumpMetaDataOnly(conf)) {
-  return;
+  return Collections.EMPTY_LIST;
 }
-for (ReplPathMapping replPathMapping: replPathMappings) {
-  Task copyTask = ReplCopyTask.getLoadCopyTask(
-  tuple.replicationSpec, replPathMapping.getSrcPath(), 
replPathMapping.getTargetPath(), conf, false);
-  this.addDependentTask(copyTask);
-  LOG.info("Scheduled a repl copy task from [{}] to [{}]",
-  replPathMapping.getSrcPath(), replPathMapping.getTargetPath());
+return replPathMappings;
+  }
+
+  private void intitiateDataCopyTasks() throws SemanticException {
+Iterator extCopyWorkItr = 
work.getDirCopyIterator();
+List> childTasks = new ArrayList<>();
+int maxTasks = 
conf.getIntVar(HiveConf.ConfVars.REPL_APPROX_MAX_LOAD_TASKS);
+TaskTracker taskTracker = new TaskTracker(maxTasks);
+while (taskTracker.canAddMoreTasks() && hasMoreCopyWork()) {
+  if (work.replPathIteratorInitialized() && extCopyWorkItr.hasNext()) {
+childTasks.addAll(new ExternalTableCopyTaskBuilder(work, 
conf).tasks(taskTracker));
+  } else {
+childTasks.addAll(ReplPathMapping.tasks(work, taskTracker, conf));
+  }
+}
+if (!childTasks.isEmpty()) {
+  DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+} else {
+  prepareReturnValues(work.getResultValues());
+  childTasks.add(TaskFactory.get(new 
ReplOperationCompleteAckWork(work.getDumpAckFile()), conf));
 }
 
 Review comment:
   Have refactored a bit. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407360)
Time Spent: 9h  (was: 8h 50m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=407358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407358
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 21/Mar/20 02:29
Start Date: 21/Mar/20 02:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #951: HIVE-22997 
: Copy external table to target during Repl Dump operation
URL: https://github.com/apache/hive/pull/951#discussion_r395952912
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
 ##
 @@ -122,33 +128,38 @@ public String getName() {
   @Override
   public int execute() {
 try {
-  Hive hiveDb = getHive();
-  Path dumpRoot = new Path(conf.getVar(HiveConf.ConfVars.REPLDIR),
-  
Base64.getEncoder().encodeToString(work.dbNameOrPattern.toLowerCase()
-  .getBytes(StandardCharsets.UTF_8.name(;
-  Path previousDumpMetaPath = getPreviousDumpMetadataPath(dumpRoot);
-  Path previousHiveDumpPath =
-  previousDumpMetaPath != null ? new Path(previousDumpMetaPath, 
ReplUtils.REPL_HIVE_BASE_DIR) : null;
-  //If no previous dump is present or previous dump was loaded, proceed
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407358)
Time Spent: 8h 50m  (was: 8h 40m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.2.patch, 
> HIVE-22997.4.patch, HIVE-22997.5.patch, HIVE-22997.6.patch, 
> HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=442121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442121
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 06/Jun/20 00:24
Start Date: 06/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #951:
URL: https://github.com/apache/hive/pull/951#issuecomment-639913878


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 442121)
Time Spent: 9h 20m  (was: 9h 10m)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.17.patch, 
> HIVE-22997.18.patch, HIVE-22997.19.patch, HIVE-22997.2.patch, 
> HIVE-22997.20.patch, HIVE-22997.21.patch, HIVE-22997.22.patch, 
> HIVE-22997.23.patch, HIVE-22997.24.patch, HIVE-22997.25.patch, 
> HIVE-22997.26.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22997) Copy external table to target during Repl Dump operation

2020-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22997?focusedWorklogId=442099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-442099
 ]

ASF GitHub Bot logged work on HIVE-22997:
-

Author: ASF GitHub Bot
Created on: 06/Jun/20 00:23
Start Date: 06/Jun/20 00:23
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #951:
URL: https://github.com/apache/hive/pull/951


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 442099)
Time Spent: 9h 10m  (was: 9h)

> Copy external table to target during Repl Dump operation
> 
>
> Key: HIVE-22997
> URL: https://issues.apache.org/jira/browse/HIVE-22997
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22997.03.patch, HIVE-22997.04.patch, 
> HIVE-22997.1.patch, HIVE-22997.10.patch, HIVE-22997.11.patch, 
> HIVE-22997.12.patch, HIVE-22997.13.patch, HIVE-22997.14.patch, 
> HIVE-22997.15.patch, HIVE-22997.16.patch, HIVE-22997.17.patch, 
> HIVE-22997.18.patch, HIVE-22997.19.patch, HIVE-22997.2.patch, 
> HIVE-22997.20.patch, HIVE-22997.21.patch, HIVE-22997.22.patch, 
> HIVE-22997.23.patch, HIVE-22997.24.patch, HIVE-22997.25.patch, 
> HIVE-22997.26.patch, HIVE-22997.4.patch, HIVE-22997.5.patch, 
> HIVE-22997.6.patch, HIVE-22997.7.patch, HIVE-22997.8.patch, HIVE-22997.9.patch
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)