atsaonerk opened a new pull request, #3644: URL: https://github.com/apache/hive/pull/3644
Currently, the partitions of a partitioned table are dumped in parallel, but a table that is not partitioned is dumped serially. This change introduces parallelism at the table level as well. The single thread pool that is currently used for partition-level dumps is also used for table-level dumps: the table-level dump task is added to the same thread pool. The degree of parallelism depends on the config parameter REPL_PARTITIONS_DUMP_PARALLELISM, whose default value is 100.

This change introduces a new ExportService, which is responsible for exporting tables and partitions during repl dump. The ExportService is initialized and configured with thread pools by the HiveServer2 service. A new Hive config variable, "REPL_TABLE_DUMP_PARALLELISM", is introduced to define the number of threads created in the thread pool. The ExportService is created as a singleton instance and is used by ReplDumpTask.
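For readers who want a concrete picture of the pattern described above (a singleton service that owns one thread pool, with both table-level and partition-level dump tasks submitted to it), here is a minimal sketch. It is not the code from this pull request: the class name `ExportServiceSketch`, its methods, the unit names, and the pool size of 4 are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the ExportService added by the PR.
final class ExportServiceSketch {
    private static final ExportServiceSketch INSTANCE = new ExportServiceSketch();

    // One pool shared by table-level and partition-level export tasks.
    private ExecutorService pool;

    private ExportServiceSketch() { }

    public static ExportServiceSketch getInstance() {
        return INSTANCE;
    }

    // Called once by the hosting service (HiveServer2 in the PR description).
    // The thread count stands in for a value read from configuration.
    public synchronized void configure(int threads) {
        if (pool == null) {
            pool = Executors.newFixedThreadPool(threads);
        }
    }

    // Queues one export task, regardless of whether it covers a table or a partition.
    public synchronized <T> Future<T> submit(Callable<T> exportTask) {
        if (pool == null) {
            throw new IllegalStateException("ExportServiceSketch is not configured");
        }
        return pool.submit(exportTask);
    }

    // Stops accepting tasks and waits up to 30 seconds for queued ones to finish.
    public synchronized void shutdown() throws InterruptedException {
        if (pool != null) {
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
            pool = null;
        }
    }

    // Submits every unit to the shared pool, then collects results in submission order.
    public static List<String> dumpAll(List<String> units)
            throws InterruptedException, ExecutionException {
        ExportServiceSketch service = getInstance();
        List<Future<String>> pending = new ArrayList<>();
        for (String unit : units) {
            Future<String> future = service.submit(() -> "dumped " + unit);
            pending.add(future);
        }
        List<String> done = new ArrayList<>();
        for (Future<String> future : pending) {
            done.add(future.get());
        }
        return done;
    }

    public static void main(String[] args) throws Exception {
        ExportServiceSketch service = getInstance();
        service.configure(4); // assumed pool size for the demo
        try {
            List<String> results = dumpAll(Arrays.asList(
                    "db.unpartitioned_tbl",
                    "db.part_tbl/ds=2022-10-01",
                    "db.part_tbl/ds=2022-10-02"));
            for (String line : results) {
                System.out.println(line);
            }
        } finally {
            service.shutdown();
        }
    }
}
```

In this sketch the unpartitioned table and the individual partitions are queued as equal units of work, so an unpartitioned table no longer has to wait for a serial pass. Collecting the futures in submission order keeps the output deterministic even though the tasks may complete in any order.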
