Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2979: Fix scheduling on remote hosts ......................................................................
Patch Set 24: (3 comments) Just passing through. http://gerrit.cloudera.org:8080/#/c/2200/24/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS24, Line 1089: std::mt19937 g(rand()) You'd probably want to use a std::random_device() to seed the generator here as rand() returns a deterministic value always; seeing as you aren't seeding with srand() first. We already do this here: https://github.com/cloudera/Impala/blob/cdh5-trunk/be/src/rpc/authentication.cc#L489-490 http://gerrit.cloudera.org:8080/#/c/2200/24/be/src/scheduling/simple-scheduler.h File be/src/scheduling/simple-scheduler.h: PS24, Line 192: int next_unused_backend_idx_; Shouldn't this be an atomicInt? If multiple queries are running in parallel and they all are scheduling at the same time, this could break the logic. PS24, Line 353: replica_preference: This value is used as a minimum memory distance for all : /// replicas. For example, by setting this to DISK_LOCAL, all : /// cached replicas will be treated as if they were not cached, : /// but local disk replicas. This can help prevent hot-spots by : /// spreading the assignments over more replicas. Allowed values : /// are CACHE_LOCAL, DISK_LOCAL and REMOTE. : /// : /// disable_cached_reads: Setting this value to true is equivalent to setting : /// replica_preference to DISK_LOCAL and takes precedence over : /// replica_preference. : /// : /// schedule_random_replica: When equivalent backends are found for a scan range (same : /// memory distance, same amount of assigned work), then : /// the first one will be picked deterministically. This aims : /// to make better use of OS buffer caches, but can lead to : /// performance bottlenecks on single hosts. Setting this : /// option to true will randomly change the order in which : /// equivalent replicas are picked for different plan nodes. : /// This helps to compute a more even assignment, on the : /// downside of increased memory usage for OS buffer caches. It be useful to mention the defaults of these query options here. -- To view, visit http://gerrit.cloudera.org:8080/2200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I044f83806fcde820fcb38047cf6b8e780d803858 Gerrit-PatchSet: 24 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com> Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com> Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com> Gerrit-Reviewer: anujphadke <apha...@cloudera.com> Gerrit-HasComments: Yes