[jira] [Updated] (HIVE-23521) REPL: Optimise partition loading during bootstrap

2020-06-10 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-23521:

Fix Version/s: 4.0.0
   Resolution: Fixed
       Status: Resolved  (was: Patch Available)

Fixed via HIVE-23520. Marking this as resolved.

> REPL: Optimise partition loading during bootstrap
> -------------------------------------------------
>
> Key: HIVE-23521
> URL: https://issues.apache.org/jira/browse/HIVE-23521
> Project: Hive
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23521.1.patch, HIVE-23521.2.patch
>
>
> When bootstrapping from a large "REPL dump" with ~10K partitions, the load
> executes "addPartition" sequentially and takes a very long time, because
> every call communicates with HMS, registers the partition, etc.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java#L399]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java#L165]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java#L210]
> When bootstrap loading has to deal with DDL, it would be good to collate all
> partitions into a single call to HMS. This would help reduce the overall runtime.
>  
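
The difference between per-partition calls and one collated call can be illustrated with a minimal sketch. This is not the code from the attached patches: `PartitionLoadSketch`, `CountingMetastore`, `loadSequentially`, and `loadBatched` are hypothetical names, and the toy metastore only counts round trips instead of talking to HMS. The batch size parameter is also an assumption; passing the full list size gives the single call the description asks for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: models HMS round trips only, not the real Hive client.
final class PartitionLoadSketch {

    /** Toy metastore: every method call stands for one round trip to HMS. */
    static final class CountingMetastore {
        final List<String> registered = new ArrayList<>();
        int roundTrips = 0;

        // One round trip per partition.
        void addPartition(String partitionName) {
            roundTrips++;
            registered.add(partitionName);
        }

        // One round trip for a whole list of partitions.
        void addPartitions(List<String> partitionNames) {
            roundTrips++;
            registered.addAll(partitionNames);
        }
    }

    /** Current behaviour as described: one call per partition. */
    static void loadSequentially(CountingMetastore hms, List<String> partitions) {
        for (String partition : partitions) {
            hms.addPartition(partition);
        }
    }

    /** Proposed behaviour: collate partitions and send them in batches. */
    static void loadBatched(CountingMetastore hms, List<String> partitions, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < partitions.size(); start += batchSize) {
            int end = Math.min(start + batchSize, partitions.size());
            hms.addPartitions(partitions.subList(start, end));
        }
    }

    public static void main(String[] args) {
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            partitions.add("ds=" + i);
        }

        CountingMetastore sequential = new CountingMetastore();
        loadSequentially(sequential, partitions);

        CountingMetastore batched = new CountingMetastore();
        loadBatched(batched, partitions, partitions.size());

        System.out.println("sequential round trips: " + sequential.roundTrips);
        System.out.println("batched round trips: " + batched.roundTrips);
    }
}
```

In this model the round-trip count drops from one per partition to one per batch. How much runtime that saves in a real bootstrap depends on HMS latency and per-call overhead, which the sketch does not measure.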



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23521) REPL: Optimise partition loading during bootstrap

2020-05-25 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-23521:

Attachment: HIVE-23521.2.patch

> REPL: Optimise partition loading during bootstrap
> -------------------------------------------------
>
> Key: HIVE-23521
> URL: https://issues.apache.org/jira/browse/HIVE-23521
> Project: Hive
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Major
> Attachments: HIVE-23521.1.patch, HIVE-23521.2.patch
>





[jira] [Updated] (HIVE-23521) REPL: Optimise partition loading during bootstrap

2020-05-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-23521:

Assignee: Rajesh Balamohan
  Status: Patch Available  (was: Open)

> REPL: Optimise partition loading during bootstrap
> -------------------------------------------------
>
> Key: HIVE-23521
> URL: https://issues.apache.org/jira/browse/HIVE-23521
> Project: Hive
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Major
> Attachments: HIVE-23521.1.patch
>





[jira] [Updated] (HIVE-23521) REPL: Optimise partition loading during bootstrap

2020-05-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-23521:

Attachment: HIVE-23521.1.patch

> REPL: Optimise partition loading during bootstrap
> -------------------------------------------------
>
> Key: HIVE-23521
> URL: https://issues.apache.org/jira/browse/HIVE-23521
> Project: Hive
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Priority: Major
> Attachments: HIVE-23521.1.patch
>


