[ https://issues.apache.org/jira/browse/KUDU-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke reassigned KUDU-2785:
---------------------------------

    Assignee: Grant Henke

> Support more parallel scanners in the backup job
> ------------------------------------------------
>
>                 Key: KUDU-2785
>                 URL: https://issues.apache.org/jira/browse/KUDU-2785
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.9.0
>            Reporter: Grant Henke
>            Assignee: Grant Henke
>            Priority: Major
>              Labels: backup
>
> Currently the KuduBackup job uses 1 scanner and therefore 1 Spark task per
> Kudu partition. When KUDU-2670 is complete, we should consider and test
> having more than one scanner per partition and instead configuring a target
> data size for each scanner. That should result in faster and more
> reliable/predictable backup jobs regardless of partition count.
> It may however make restoring more difficult because it could cause
> compactions. Restore side testing and improvements may also be required.
> Improvements to the estimation for key range sizes may also need to be done,
> so this should be well tested.
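
To illustrate the idea of sizing scanners by a target data size rather than one per partition, here is a rough sketch in plain Python. It is not KuduBackup code and does not call the Kudu client. The names `Partition`, `estimated_bytes`, `scanners_for` and `plan_scanners` are hypothetical, and the per-partition size estimate is assumed to come from somewhere else (for example, whatever KUDU-2670 ends up providing).

```python
from dataclasses import dataclass
from typing import Dict, List

MIB = 1024 * 1024


@dataclass
class Partition:
    """Hypothetical stand-in for a Kudu partition with a size estimate."""
    name: str
    estimated_bytes: int  # assumed to be supplied by some size estimator


def scanners_for(partition: Partition, target_bytes: int) -> int:
    """Return how many scanners to use so each covers ~target_bytes.

    Always returns at least 1, so empty or tiny partitions still get a scanner.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    # Integer ceiling division: avoids float rounding on large byte counts.
    needed = -(-partition.estimated_bytes // target_bytes)
    return max(1, needed)


def plan_scanners(partitions: List[Partition], target_bytes: int) -> Dict[str, int]:
    """Map each partition name to its scanner (and so Spark task) count."""
    return {p.name: scanners_for(p, target_bytes) for p in partitions}
```

With a 256 MiB target, a 100 MiB partition keeps a single scanner while a 1 GiB partition gets four, so the task count follows data volume instead of partition count. The plan is only as good as the size estimates behind it, which is why the issue calls out estimation of key range sizes as something to test.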