[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has abandoned this change. ( http://gerrit.cloudera.org:8080/7919 ) Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Abandoned superseded by http://gerrit.cloudera.org/c/7985 -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-Change-Number: 7919 Gerrit-PatchSet: 2 Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has submitted this change and it was merged. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Reviewed-on: http://gerrit.cloudera.org:8080/7985 Reviewed-by: Dan BurkertTested-by: Kudu Jenkins --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 243 insertions(+), 62 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 10: Code-Review+2 (1 comment) Carrying over Mike's +2. Had to rebase due to merge conflict in test. http://gerrit.cloudera.org:8080/#/c/7985/8/src/kudu/integration-tests/cluster_itest_util.cc File src/kudu/integration-tests/cluster_itest_util.cc: Line 1019: if (error_code) *error_code = error.code(); > I'm not sure if you did this or I did this, but if error_code is passed in Done -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Mike Percy, Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#10). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 243 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/10 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#9). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 243 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/9 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 8: -Code-Review (1 comment) http://gerrit.cloudera.org:8080/#/c/7985/8/src/kudu/integration-tests/cluster_itest_util.cc File src/kudu/integration-tests/cluster_itest_util.cc: Line 1019: *error_code = error.code(); I'm not sure if you did this or I did this, but if error_code is passed in as nullptr this will result in a segfault, which is bad because we default it to nullptr in the header file. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 8: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: Line 166: if (!reader) { > Hmm, I'm not seeing where the log_ field in TabletReplica is getting reset? Ah, you're right. http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: Line 73: TAG_FLAG(tablet_copy_session_inject_latency_on_init_ms, hidden); > Done Unsafe implies hidden, so explicit hidden isn't actually needed. But not a big deal. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 8: Code-Review+1 Mike should review this. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 7: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#7). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 243 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/7 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 6: Verified+1 ignore bogus iwyu -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 5: Verified+1 ignoring iwyu failure; looks wrong to me. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#5). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 243 insertions(+), 63 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/5 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/integration-tests/tablet_copy-itest.cc File src/kudu/integration-tests/tablet_copy-itest.cc: Line 1674: ASSERT_LT(lat, kInjectedLatency); > Can you also try looping it 1000 times in TSAN mode with some stress thread done: http://dist-test.cloudera.org/job?job_id=dan.1504902130.6748 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 4: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/integration-tests/tablet_copy-itest.cc File src/kudu/integration-tests/tablet_copy-itest.cc: Line 1674: } > This check is pretty critical to ensuring no regressions. We could weaken Can you also try looping it 1000 times in TSAN mode with some stress threads? If that passes then I think it's OK. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/integration-tests/tablet_copy-itest.cc File src/kudu/integration-tests/tablet_copy-itest.cc: Line 1643: Status s = itest::BeginTabletCopySession(ts, tablet_id, "dummy-uuid", kTimeout, ); > Since the response isn't actually used, consider dropping it from BeginTabl Done Line 1674: ASSERT_LT(lat, kInjectedLatency); > This seems risky; a variable amount of latency could be added to a failure This check is pretty critical to ensuring no regressions. We could weaken it a bit, but it would correspondingly weaken the test. I looped this 100 times to make sure it's not flaky, and it passed 100%. http://dist-test.cloudera.org/job?job_id=dan.1504897900.15603 http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_service-test.cc File src/kudu/tserver/tablet_copy_service-test.cc: Line 265: } > If you expect some contention for a fixed amount of time, it might be usefu I think it's valuable to have this pinging the methods as fast as possible in order to suss out concurrency issues. http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_service.cc File src/kudu/tserver/tablet_copy_service.cc: PS3, Line 185: SessionEntry* session_entry = FindOrNull(sessions_, session_id); : // Ensure that another interleaved thread has not already removed the failed session. : if (session_entry && session == session_entry->session) { : sessions_.erase(session_id); : } > Can you add to the comment to explain the identity check? I imagine it's to Done Line 267: // The corresponding session is already in the process of initializing. > Can you use RPC_RETURN_NOT_OK here? Done http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: Line 73: TAG_FLAG(tablet_copy_session_inject_latency_on_init_ms, unsafe); > Make it hidden. Done Line 74: TAG_FLAG(tablet_copy_session_inject_latency_on_init_ms, runtime); > I don't see the new test changing the value of this flag at runtime; can th Done PS3, Line 128: TRACE("Injecting $0ms of latency due to --tablet_copy_session_inject_latency_on_init_ms", : FLAGS_tablet_copy_session_inject_latency_on_init_ms); > Is this useful to the tests that usei t? I think this is a common patter, e.g. https://github.com/apache/kudu/blob/master/src/kudu/tablet/transactions/write_transaction.cc#L153 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#4). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 244 insertions(+), 64 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/4 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/integration-tests/tablet_copy-itest.cc File src/kudu/integration-tests/tablet_copy-itest.cc: Line 1643: Status s = itest::BeginTabletCopySession(ts, tablet_id, "dummy-uuid", kTimeout, ); Since the response isn't actually used, consider dropping it from BeginTabletCopySession. It can always be added later. Line 1674: ASSERT_LT(lat, kInjectedLatency); This seems risky; a variable amount of latency could be added to a failure from load and scheduling. http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_service-test.cc File src/kudu/tserver/tablet_copy_service-test.cc: Line 265: } If you expect some contention for a fixed amount of time, it might be useful to add a Sleep here rather than making it a tight loop to make the test is a little more CPU friendly. http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_service.cc File src/kudu/tserver/tablet_copy_service.cc: PS3, Line 185: SessionEntry* session_entry = FindOrNull(sessions_, session_id); : // Ensure that another interleaved thread has not already removed the failed session. : if (session_entry && session == session_entry->session) { : sessions_.erase(session_id); : } Can you add to the comment to explain the identity check? I imagine it's to protect against another thread not only removing this session but also replacing it with another one. Line 267: // The corresponding session is already in the process of initializing. Can you use RPC_RETURN_NOT_OK here? http://gerrit.cloudera.org:8080/#/c/7985/3/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: Line 73: TAG_FLAG(tablet_copy_session_inject_latency_on_init_ms, unsafe); Make it hidden. Line 74: TAG_FLAG(tablet_copy_session_inject_latency_on_init_ms, runtime); I don't see the new test changing the value of this flag at runtime; can this be omitted? PS3, Line 128: TRACE("Injecting $0ms of latency due to --tablet_copy_session_inject_latency_on_init_ms", : FLAGS_tablet_copy_session_inject_latency_on_init_ms); Is this useful to the tests that usei t? -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#3). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 249 insertions(+), 64 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/3 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_service.cc File src/kudu/tserver/tablet_copy_service.cc: Line 171: RPC_RETURN_NOT_OK(session->Init(), > It would be nice to remember if we were the thread to insert the session in Done Line 249: RPC_RETURN_NOT_OK(session->Init(), > Instead of waiting, we could simply reject the request if (!initted()) Done http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: Line 166: shared_ptr reader = tablet_replica_->log()->reader(); > This is an existing bug, but we should take a ref to tablet_replica_->log() Hmm, I'm not seeing where the log_ field in TabletReplica is getting reset? The header docs for log() also don't mention this. Line 389: Status s; > no longer used Done http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_source_session.h File src/kudu/tserver/tablet_copy_source_session.h: PS1, Line 33: "kudu/gutil/integral_types.h" > Huh. I didn't know about this file. I've always used not sure, this is just a reorder. PS1, Line 192: scoped_refptr > nit: This can be made const (I know you haven't worked on this code before Done Line 197: KuduOnceDynamic init_once_; > Would be nice to add a comment noting that once the object is initialized, Added something to that effect. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7985 to look at the new patch set (#2). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 241 insertions(+), 63 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/2 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_source_session.h File src/kudu/tserver/tablet_copy_source_session.h: Line 194 It's nice how much simpler the lifecycle of TabletCopySourceSession is with this change. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_service.cc File src/kudu/tserver/tablet_copy_service.cc: Line 171: RPC_RETURN_NOT_OK(session->Init(), It would be nice to remember if we were the thread to insert the session into the sessions_ map and, if not, reject the request if (!initted()) to free up handler thread capacity. Line 249: RPC_RETURN_NOT_OK(session->Init(), Instead of waiting, we could simply reject the request if (!initted()) http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: Line 166: shared_ptr reader = tablet_replica_->log()->reader(); This is an existing bug, but we should take a ref to tablet_replica_->log() and ensure it's non-false before calling reader() on it, since TabletReplica::Stop() will reset it and this could cause a segfault. Line 389: Status s; no longer used http://gerrit.cloudera.org:8080/#/c/7985/1/src/kudu/tserver/tablet_copy_source_session.h File src/kudu/tserver/tablet_copy_source_session.h: PS1, Line 33: "kudu/gutil/integral_types.h" Huh. I didn't know about this file. I've always used PS1, Line 192: scoped_refptr nit: This can be made const (I know you haven't worked on this code before and didn't add this member) Line 197: KuduOnceDynamic init_once_; Would be nice to add a comment noting that once the object is initialized, all fields are thereafter const. -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Dan Burkert has uploaded a new change for review. http://gerrit.cloudera.org:8080/7985 Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch replaces the interior mutex in TabletCopySourceSession with a dynamic once, so that initialization can happen concurrently. This allows initialization to happen outside of the critical section in the tablet copy service. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_copy_source_session.h M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 205 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/7985/1 -- To view, visit http://gerrit.cloudera.org:8080/7985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If8bd295a59ade8c89fdf1853dd64c4bceca8da91 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan BurkertGerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 2: > > the remote can time out and retry based on the leader telling > > it to retry, and those requests can be stuck in the server's > > RPC queue. > > Hmm, and we don't want to pay the cost of instantiating any > sessions beyond the first one? Okay, I guess I can buy that. > > I was asking because if we didn't mind creating the occasional > extra session or two (and just let them get cleaned up via > "expiration"), we could use an incrementing integer as the session > key, which would guarantee no session reuse at all, thus avoiding > the thorny synchronization around session creation. > > What do you think about using a per-session std::once to guarantee > Init is called? The synchronization would be something like this: On the Init() issue, I think we could make it guaranteed by simply giving the session itself a factory method that initializes it before returning it. However, I'm concerned about being able to get into a state where the server is doing a lot of wasted work initializing sessions, and the client is busy retrying and creating lots of extra sessions, resulting in a kind of livelock situation. I think the current implementation is simple enough, with the right abstractions, to merit using in order to avoid those kinds of nasty issues. In the current implementation, the client can start a session, and the RPC can time out, but the session will eventually be created. The next time the client retries, if the session is already being created, the client will be told to try again later. Once the session is initialized, the next time the client asks for the session to be created, the initialized session will be returned to the client and the client can continue with tablet copy. -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 2: > It's possible because the remote can time out and retry based on > the leader telling it to retry, and those requests can be stuck in > the server's RPC queue. See KUDU-1436 for an early example of this. Hmm, and we don't want to pay the cost of instantiating any sessions beyond the first one? Okay, I guess I can buy that. I was asking because if we didn't mind creating the occasional extra session or two (and just let them get cleaned up via "expiration"), we could use an incrementing integer as the session key, which would guarantee no session reuse at all, thus avoiding the thorny synchronization around session creation. What do you think about using a per-session std::once to guarantee Init is called? The synchronization would be something like this: Take lock. Look for existing entry in session map. If not there, create a new session and it to the map. Release lock. Create a ScopedCleanup that removes the session with the lock held. RETURN_NOT_OK(Init) // via session->once, so it's a no-op if the session is already initted cleanup.cancel() This way we don't have to worry about lock interaction: the session lock and the once's spinlock are independent. -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 2: > > Having such a session id design is more about de-duplicating > > begin-session requests than reusing existing sessions. Since it's > > possible for multiple requests from the same remote to queue up, > > due to timing issues that are difficult to control, we don't want > > to end up creating many duplicate sessions for the same remote, > > each of which is holding open many files and using up memory > until > > the sessions time out. > > How is it possible for there to be multiple requests from the same > remote? TSTabletManager::RunTabletCopy fails every attempt beyond > the first one. So there should be just one TabletCopyClient for > every peer/tablet ID pair. Also, TabletCopyClient::Start sends the > "begin session" RPC synchronously and doesn't appear to retry, so I > don't see how a single tablet copy client could send that RPC > multiple times either. > > Or is this about the remote failing, restarting, retrying the copy, > and reusing its existing session? It's possible because the remote can time out and retry based on the leader telling it to retry, and those requests can be stuck in the server's RPC queue. See KUDU-1436 for an early example of this. -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 2: > Having such a session id design is more about de-duplicating > begin-session requests than reusing existing sessions. Since it's > possible for multiple requests from the same remote to queue up, > due to timing issues that are difficult to control, we don't want > to end up creating many duplicate sessions for the same remote, > each of which is holding open many files and using up memory until > the sessions time out. How is it possible for there to be multiple requests from the same remote? TSTabletManager::RunTabletCopy fails every attempt beyond the first one. So there should be just one TabletCopyClient for every peer/tablet ID pair. Also, TabletCopyClient::Start sends the "begin session" RPC synchronously and doesn't appear to retry, so I don't see how a single tablet copy client could send that RPC multiple times either. Or is this about the remote failing, restarting, retrying the copy, and reusing its existing session? -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Mike Percy has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 2: > I'd like to better understand the concurrency model here. > > The tablet copy service maintains a session map whose key is > requestor ID + tablet ID. The requestor ID is just the UUID of the > tserver sending the "Begin Session" request. > > Under what circumstances can there be a "collision"? Seems like in > order to begin a copy the requestor will have already "locked" the > tablet (via TSTabletManager::StartTabletStateTransitionUnlocked), > so it won't initiate a second copy if one is in flight. And other > "Begin Session" requests from other tservers will have different > requestor IDs, so they won't collide. > > Much of the complexity (and this fix) have to do with the desired > behavior in the event of session ID reuse, so I want to understand > the circumstances and/or motivation behind reuse first. Having such a session id design is more about de-duplicating begin-session requests than reusing existing sessions. Since it's possible for multiple requests from the same remote to queue up, due to timing issues that are difficult to control, we don't want to end up creating many duplicate sessions for the same remote, each of which is holding open many files and using up memory until the sessions time out. -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7919 to look at the new patch set (#2). Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch incorporates a lock table to hold reservations for sessions currently being initialized, instead of holding the session lock during session initialization. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service-test.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_service.h M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 11 files changed, 245 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/7919/2 -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Adar Dembo has posted comments on this change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. Patch Set 1: I'd like to better understand the concurrency model here. The tablet copy service maintains a session map whose key is requestor ID + tablet ID. The requestor ID is just the UUID of the tserver sending the "Begin Session" request. Under what circumstances can there be a "collision"? Seems like in order to begin a copy the requestor will have already "locked" the tablet (via TSTabletManager::StartTabletStateTransitionUnlocked), so it won't initiate a second copy if one is in flight. And other "Begin Session" requests from other tservers will have different requestor IDs, so they won't collide. Much of the complexity (and this fix) have to do with the desired behavior in the event of session ID reuse, so I want to understand the circumstances and/or motivation behind reuse first. -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: No
[kudu-CR] KUDU-2124. Don't hold session lock while initializing a TabletCopySession
Hello Dan Burkert, Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/7919 to review the following change. Change subject: KUDU-2124. Don't hold session lock while initializing a TabletCopySession .. KUDU-2124. Don't hold session lock while initializing a TabletCopySession This patch incorporates a lock table to hold reservations for sessions currently being initialized, instead of holding the session lock during session initialization. Implemented a regression test that injects latency into BeginTabletCopySession() calls. Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe --- M src/kudu/integration-tests/cluster_itest_util.cc M src/kudu/integration-tests/cluster_itest_util.h M src/kudu/integration-tests/tablet_copy-itest.cc M src/kudu/tserver/tablet_copy_service.cc M src/kudu/tserver/tablet_copy_service.h M src/kudu/tserver/tablet_copy_source_session.cc M src/kudu/tserver/tablet_server-test-base.h M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/tablet_server_test_util.cc M src/kudu/tserver/tablet_server_test_util.h 10 files changed, 231 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/7919/1 -- To view, visit http://gerrit.cloudera.org:8080/7919 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I05f5b3c6a012067c95bb54f040bee2bb19388bfe Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike PercyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert