Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13545 )

Change subject: IMPALA-8630: Include partition id when calculating consistent remote placement
......................................................................

IMPALA-8630: Include partition id when calculating consistent remote placement

Consistent remote placement currently uses the relative filename
within a partition as the input to the consistent hash. If the
partitions use a simple file naming scheme, then multiple partitions
may contain files with the same relative name. This is true for some
tables written by Hive (e.g. tpcds.store_sales in our minicluster has
this problem). Files with the same name hash to the same value, which
can lead to unbalanced placement of remote ranges.

This change adds the partition_id to the hash, so files with the same
name from different partitions will have different hashes.

Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b
---
M be/src/scheduling/scheduler.cc
1 file changed, 5 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13545/1
--
To view, visit http://gerrit.cloudera.org:8080/13545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b
Gerrit-Change-Number: 13545
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
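
For illustration only, the idea in the commit message can be sketched as
follows. This is not the code in be/src/scheduling/scheduler.cc: the function
names are hypothetical, and FNV-1a is used only as a stand-in for whatever hash
the scheduler actually uses. The sketch shows why hashing the file name alone
collides across partitions, and how mixing in the partition id first separates
the hashes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch, NOT Impala's scheduler code. FNV-1a is an assumed
// stand-in hash chosen because it is short and deterministic.
constexpr uint64_t kFnvOffsetBasis = 14695981039346656037ULL;
constexpr uint64_t kFnvPrime = 1099511628211ULL;

// Mix a single byte into the running hash (one FNV-1a step).
inline uint64_t FnvMixByte(uint64_t hash, uint8_t byte) {
  return (hash ^ byte) * kFnvPrime;
}

// Mix every byte of a string into the running hash.
inline uint64_t FnvMixString(uint64_t hash, const std::string& s) {
  for (unsigned char c : s) hash = FnvMixByte(hash, c);
  return hash;
}

// Mix a 64-bit integer into the running hash, least significant byte first,
// so the result does not depend on the host's endianness.
inline uint64_t FnvMixInt64(uint64_t hash, int64_t value) {
  const uint64_t bits = static_cast<uint64_t>(value);
  for (int i = 0; i < 8; ++i) {
    hash = FnvMixByte(hash, static_cast<uint8_t>(bits >> (8 * i)));
  }
  return hash;
}

// Old behavior as described in the commit message: only the relative file
// name feeds the hash, so same-named files in different partitions collide.
uint64_t HashFileNameOnly(const std::string& relative_path) {
  return FnvMixString(kFnvOffsetBasis, relative_path);
}

// New behavior as described in the commit message: the partition id is mixed
// in as well, so same-named files in different partitions hash differently.
uint64_t HashWithPartitionId(int64_t partition_id,
                             const std::string& relative_path) {
  return FnvMixString(FnvMixInt64(kFnvOffsetBasis, partition_id),
                      relative_path);
}
```

With this sketch, a file name shared by two partitions (for example an
illustrative name such as "000000_0") yields one value from HashFileNameOnly
regardless of partition, while HashWithPartitionId(1, ...) and
HashWithPartitionId(2, ...) return different values for that same name.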