[kudu-CR] [c++ client] re-acquire authn token if expired
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6648 to look at the new patch set (#11). Change subject: [c++ client] re-acquire authn token if expired .. [c++ client] re-acquire authn token if expired Updated C++ client to re-acquire authn token if server responds with FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation. If that happens while negotiating an connection while performing an RPC call, the connection being negotiated is closed and the client re-connects to the cluster to get new authn token. After successfully re-acquiring authn token the client retries the RPC call again. Added corresponding integration test to cover authn token re-acquisition for various scenarios. Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 --- M docs/known_issues.adoc M docs/security.adoc M src/kudu/client/batcher.cc M src/kudu/client/client-internal.cc M src/kudu/client/meta_cache.cc M src/kudu/client/scanner-internal.cc M src/kudu/client/scanner-internal.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/authn_token_expire-itest.cc M src/kudu/rpc/client_negotiation.cc M src/kudu/rpc/client_negotiation.h M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/messenger.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/retriable_rpc.h M src/kudu/rpc/rpc.h M src/kudu/rpc/server_negotiation.cc 21 files changed, 553 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/11 -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Adar Dembo has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 13: (2 comments) http://gerrit.cloudera.org:8080/#/c/6630/14/docs/design-docs/cfile.md File docs/design-docs/cfile.md: Line 27: Header > I will add that they are optional. I am not sure if the cflags should go in That's fair. http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile-test.cc File src/kudu/cfile/cfile-test.cc: Line 336: data.mutable_data()[corrupt_offset] = corrupt_value; Nit: could do data.data()[corrupt_offset] here I think (though it may mean making 'orig' const uint8_t), since the data isn't yet beint modified. -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 13 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Adar Dembo has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/6630/15//COMMIT_MSG Commit Message: Line 7: WIP: KUDU-463. Add checksumming to cfile Also, this is no longer a WIP, right? -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6514 to look at the new patch set (#16). Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs This rejects unauthenticated connections from publicly routable IPs, even if authentication and encryption are not configured. An unsafe flag 'allow_unauthenticated_public_connections' is provided to enable unauthenticated connections from publicly routable IPs. Even though it is highly recommended to disable it. Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e --- M src/kudu/rpc/negotiation-test.cc M src/kudu/rpc/server_negotiation.cc M src/kudu/rpc/server_negotiation.h M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/server/server_base.cc M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h M src/kudu/util/net/sockaddr.cc M src/kudu/util/net/sockaddr.h M src/kudu/util/net/socket.h 12 files changed, 243 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/6514/16 -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP: release WritableLogSegment buffers when log is idle
Todd Lipcon has posted comments on this change. Change subject: WIP: release WritableLogSegment buffers when log is idle .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6611/2/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: PS2, Line 224: // TODO(todd): consider rolling to a non-preallocated segment if we are : // idle for a good period of time (eg a few minutes), so that startup time is : // improved and disk usage is minimized. > I forget, do all of you memstores get flushed in a time basis, no matter ho yes, I believe so. -- To view, visit http://gerrit.cloudera.org:8080/6611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I512c7f9264ac9490e6a57259e4d7b2608adc1ca7 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6514 to look at the new patch set (#17). Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs This rejects unauthenticated connections from publicly routable IPs, even if authentication and encryption are not configured. An unsafe flag 'allow_unauthenticated_public_connections' is provided to enable unauthenticated connections from publicly routable IPs. Even though it is highly recommended to disable it. Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e --- M src/kudu/rpc/negotiation-test.cc M src/kudu/rpc/server_negotiation.cc M src/kudu/rpc/server_negotiation.h M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/server/server_base.cc M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h M src/kudu/util/net/sockaddr.cc M src/kudu/util/net/sockaddr.h M src/kudu/util/net/socket.h 12 files changed, 243 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/6514/17 -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 17 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] WIP [java] re-acquire authn token if expired
Alexey Serbin has posted comments on this change. Change subject: WIP [java] re-acquire authn token if expired .. Patch Set 2: Verified+1 Unrelated MultiThreadedTabletTest/4.DeleteAndReinsert flake -- To view, visit http://gerrit.cloudera.org:8080/6816 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad5bef1d06708215839037741cd3bc213e130af2 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] spark: add support for fault tolerant scanner
Dan Burkert has posted comments on this change. Change subject: spark: add support for fault tolerant scanner .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6782/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala: Line 50: val FAULT_TOLERANT_SCANNER = "kudu.fault.tolerant.scan" > hmm should this be something like 'kudu.fault-tolerant-scan'? I typically looks like the normal thing to do with spark options is camel case, so this should be "kudu.faultToleranScan". exampls: https://spark.apache.org/docs/latest/monitoring.html -- To view, visit http://gerrit.cloudera.org:8080/6782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6514 to look at the new patch set (#18). Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs This rejects unauthenticated connections from publicly routable IPs, even if authentication and encryption are not configured. It also provides two flags: one adavance flag 'trusted_subnets' to whitelist trusted subnets. Connections from trustred subnets will be allowed even if unauthenticaed; another unsafe flag 'allow_unauthenticated_public_connections' is provided to enable unauthenticated connections from publicly routable IPs. Even though it is highly recommended to disable it. Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e --- M src/kudu/rpc/negotiation-test.cc M src/kudu/rpc/server_negotiation.cc M src/kudu/rpc/server_negotiation.h M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/server/server_base.cc M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h M src/kudu/util/net/sockaddr.cc M src/kudu/util/net/sockaddr.h M src/kudu/util/net/socket.h 12 files changed, 246 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/6514/18 -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 18 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Hao Hao has posted comments on this change. Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. Patch Set 18: (8 comments) http://gerrit.cloudera.org:8080/#/c/6514/17/src/kudu/rpc/server_negotiation.cc File src/kudu/rpc/server_negotiation.cc: Line 873: for (const auto const& t : strings::Split(FLAGS_trusted_subnets, ",", strings::SkipEmpty())) { > warning: the variable '__end' is copy-constructed from a const reference bu Done http://gerrit.cloudera.org:8080/#/c/6514/17/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 69: > warning: the variable '__end' is copy-constructed from a const reference bu Done http://gerrit.cloudera.org:8080/#/c/6514/17/src/kudu/util/net/net_util.cc File src/kudu/util/net/net_util.cc: Line 230: Sockaddr addr(*reinterpret_cast(ifa->ifa_addr)); > warning: C-style casts are discouraged; use reinterpret_cast [google-readab Done Line 231: Sockaddr netmask(*reinterpret_cast(ifa->ifa_netmask)); > warning: C-style casts are discouraged; use reinterpret_cast [google-readab Done http://gerrit.cloudera.org:8080/#/c/6514/17/src/kudu/util/net/net_util.h File src/kudu/util/net/net_util.h: Line 114: Status GetFQDN(std::string* hostname); > warning: function 'kudu::GetFQDN' has a definition with different parameter Done http://gerrit.cloudera.org:8080/#/c/6514/13/src/kudu/util/net/sockaddr.cc File src/kudu/util/net/sockaddr.cc: Line 120: > warning: redundant boolean literal in conditional return statement [readabi Done Line 121: // Strip any whitespace from the cidr_addr. > warning: don't use else after return [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/6514/14/src/kudu/util/net/sockaddr.h File src/kudu/util/net/sockaddr.h: Line 71: bool WithInSubnet(const std::string& net_cidr_addr) const; > warning: function 'kudu::Sockaddr::WithInSubnet' has a definition with diff Done -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 18 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6514 to look at the new patch set (#19). Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs This rejects unauthenticated connections from publicly routable IPs, even if authentication and encryption are not configured. It also provides two flags: one adavance flag 'trusted_subnets' to whitelist trusted subnets. Connections from trustred subnets will be allowed even if unauthenticaed; another unsafe flag 'allow_unauthenticated_public_connections' is provided to enable unauthenticated connections from publicly routable IPs. Even though it is highly recommended to disable it. Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e --- M src/kudu/rpc/negotiation-test.cc M src/kudu/rpc/server_negotiation.cc M src/kudu/rpc/server_negotiation.h M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/server/server_base.cc M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h M src/kudu/util/net/sockaddr.cc M src/kudu/util/net/sockaddr.h M src/kudu/util/net/socket.h 12 files changed, 246 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/6514/19 -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 19 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Fix flaky test TestRecoverFromOpIdOverflow
David Ribeiro Alves has posted comments on this change. Change subject: Fix flaky test TestRecoverFromOpIdOverflow .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6808/2/src/kudu/integration-tests/ts_recovery-itest.cc File src/kudu/integration-tests/ts_recovery-itest.cc: PS2, Line 370: if it doesn't also contain the COMMIT message for 1.1 : // yet then it will trigger a CHECK complaining about non-sequential OpIds : // in the WAL at tablet bootstrap time. maybe this would be better stated as: "If the commit message for this NO_OP (OpId 1.1) was not written to disk yet, then it might get written _after_ the ops with the overflowed ids above, triggering a CHECK about non sequential OpIds. If we remove the first segment then the tablet will just assume that commit messages for all replicates in previous segments have already been written, thus avoiding the check. PS2, Line 373: string wal_dir = fs_manager->GetTabletWalDir(tablet_id); : vector wal_children; : ASSERT_OK(fs_manager->env()->GetChildren(wal_dir, &wal_children)); : // Skip '.', '..', and index files. : unordered_set wal_segments; : for (const auto& filename : wal_children) { : if (HasPrefixString(filename, FsManager::kWalFileNamePrefix)) { : wal_segments.insert(filename); : } : } I don't think you need this. you're only checking for GT num files and the ASSERT_OK on LOC 387 already returns an error if the file doesn't exist PS2, Line 383: Files in dir I guess my confusion stemmed from the fact that "Files in dir:" can be interpreted as "Files in this directory" which I did, and then I found odd that you were printing the files instead of the directory. Maybe rephrase a bit? -- To view, visit http://gerrit.cloudera.org:8080/6808 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib382819307da04bb76d68d2c015dc0edd9f60267 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] Fix flaky test TestRecoverFromOpIdOverflow
Mike Percy has uploaded a new patch set (#3). Change subject: Fix flaky test TestRecoverFromOpIdOverflow .. Fix flaky test TestRecoverFromOpIdOverflow This test is flaky because we race against the COMMIT message for the first NO_OP in the WAL being written. It is currently hard to know when the actual COMMIT message is written to the WAL so we use a workaround to delete the first log segment before restarting the EMC in this test. If the COMMIT doesn't get written in time then the tablet bootstrap process doesn't prune that entry at startup time and it ends up in the new log prior to the much higher-numbered log entries. The flaky test failure was due to an out of order log index being detected, and looked like the following error in the log: F0504 21:28:21.128690 13908 raft_consensus_state.cc:502] Check failed: _s.ok() Bad status: Corruption: New operation's index does not follow the previous op's index. Current: 2147483648.2147483648. Previous: 1.1 *** Check failure stack trace: *** @ 0x7fe0adef915d google::LogMessage::Fail() at ??:0 @ 0x7fe0adefb05d google::LogMessage::SendToLog() at ??:0 @ 0x7fe0adef8c99 google::LogMessage::Flush() at ??:0 @ 0x7fe0adefbaff google::LogMessageFatal::~LogMessageFatal() at ??:0 @ 0x7fe0b57a8018 kudu::consensus::PendingRounds::AdvanceCommittedIndex() at ??:0 @ 0x7fe0b57848a5 kudu::consensus::RaftConsensus::NotifyCommitIndex() at ??:0 @ 0x7fe0b5749ccf kudu::consensus::PeerMessageQueue::NotifyObserversOfCommitIndexChangeTask() at ??:0 @ 0x7fe0b575bdc1 kudu::internal::RunnableAdapter<>::Run() at ??:0 Change-Id: Ib382819307da04bb76d68d2c015dc0edd9f60267 --- M src/kudu/integration-tests/ts_recovery-itest.cc 1 file changed, 28 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/6808/3 -- To view, visit http://gerrit.cloudera.org:8080/6808 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib382819307da04bb76d68d2c015dc0edd9f60267 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy
[kudu-CR] Fix flaky test TestRecoverFromOpIdOverflow
Mike Percy has posted comments on this change. Change subject: Fix flaky test TestRecoverFromOpIdOverflow .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6808/2/src/kudu/integration-tests/ts_recovery-itest.cc File src/kudu/integration-tests/ts_recovery-itest.cc: PS2, Line 370: if it doesn't also contain the COMMIT message for 1.1 : // yet then it will trigger a CHECK complaining about non-sequential OpIds : // in the WAL at tablet bootstrap time. > maybe this would be better stated as: "If the commit message for this NO_OP Done PS2, Line 373: string wal_dir = fs_manager->GetTabletWalDir(tablet_id); : vector wal_children; : ASSERT_OK(fs_manager->env()->GetChildren(wal_dir, &wal_children)); : // Skip '.', '..', and index files. : unordered_set wal_segments; : for (const auto& filename : wal_children) { : if (HasPrefixString(filename, FsManager::kWalFileNamePrefix)) { : wal_segments.insert(filename); : } : } > I don't think you need this. you're only checking for GT num files and the This is collecting the wal_segments which is used on L387 so I need the loop. The assertion is making sure we have at least 2 wal segments so that we know it's safe to delete one. I'll change it to ASSERT_GE(..., 2) to make it clearer. PS2, Line 383: Files in dir > I guess my confusion stemmed from the fact that "Files in dir:" can be inte Done -- To view, visit http://gerrit.cloudera.org:8080/6808 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib382819307da04bb76d68d2c015dc0edd9f60267 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Mike Percy Gerrit-HasComments: Yes
[kudu-CR] maintenance manager: schedule work immediately when threads are free
David Ribeiro Alves has posted comments on this change. Change subject: maintenance_manager: schedule work immediately when threads are free .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/6815/1/src/kudu/util/maintenance_manager.cc File src/kudu/util/maintenance_manager.cc: PS1, Line 43: DEFINE_int32(maintenance_manager_num_threads, 1, : "Size of the maintenance manager thread pool. " : "For spinning disks, the number of threads should " : "not be above the number of devices."); should we change this default now? PS1, Line 231: if (!FLAGS_enable_maintenance_manager) { : KLOG_EVERY_N_SECS(INFO, 30) << "Maintenance manager is disabled. Doing nothing"; : return; : } not your fault but maybe move this above the inner while loop (before LOC 221)? PS1, Line 238: // If we found no work to do, then we should sleep before trying again to schedule. : // Otherwise, we can go right into trying to find the next op. : force_sleep_next_iter = (op == nullptr); since we're already checking whether op is null below maybe it would be clearer to have: if (!op) { VLOG_AND_TRACE("maintenance", 2) << LogPrefix() << "No maintenance operations look worth doing."; // If we found no work to do, then we should sleep before trying again to schedule. force_sleep_next_iter = true; continue; } -- To view, visit http://gerrit.cloudera.org:8080/6815 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I63c4b48f5f02f3a1d3a8964993e78037ce72b1da Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] [build] disable tcmalloc for OS X builds
Alexey Serbin has uploaded a new change for review. http://gerrit.cloudera.org:8080/6821 Change subject: [build] disable tcmalloc for OS X builds .. [build] disable tcmalloc for OS X builds As a workaround for KUDU-1998, tcmalloc is disabled for OS X builds. Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 --- M CMakeLists.txt 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/6821/1 -- To view, visit http://gerrit.cloudera.org:8080/6821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin
[kudu-CR] [build] disable tcmalloc for OS X builds
Adar Dembo has posted comments on this change. Change subject: [build] disable tcmalloc for OS X builds .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6821/1/CMakeLists.txt File CMakeLists.txt: Line 892: ## As a workaround for KUDU-1998, tcmalloc is disabled in OS X builds. That's quite the hammer. Can you sync with Todd about this? He may already have a patch in the works to fix this more delicately. If not, I think he can give you some pointers on other approaches. -- To view, visit http://gerrit.cloudera.org:8080/6821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] [build] disable tcmalloc for OS X builds
Will Berkeley has posted comments on this change. Change subject: [build] disable tcmalloc for OS X builds .. Patch Set 1: Code-Review+1 Checked on my OS X machine that this fixes the problem. -- To view, visit http://gerrit.cloudera.org:8080/6821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] [build] disable tcmalloc for OS X builds
Adar Dembo has posted comments on this change. Change subject: [build] disable tcmalloc for OS X builds .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6821/1/CMakeLists.txt File CMakeLists.txt: Line 892: ## As a workaround for KUDU-1998, tcmalloc is disabled in OS X builds. > That's quite the hammer. Nevermind, Todd doesn't have anything in the works, and a potential switch to jemalloc (possibly the best approach) isn't happening right now. -- To view, visit http://gerrit.cloudera.org:8080/6821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] spark: add support for fault tolerant scanner
Hao Hao has posted comments on this change. Change subject: spark: add support for fault tolerant scanner .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/6782/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala: Line 50: val FAULT_TOLERANT_SCANNER = "kudu.fault.tolerant.scan" > looks like the normal thing to do with spark options is camel case, so this Done Line 147: private val isFaultTolerant: Boolean = faultTolerantScanner; > This isn't strictly necessary, you can reference 'FaultTolerantScanner' dir Done -- To view, visit http://gerrit.cloudera.org:8080/6782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] spark: add support for fault tolerant scanner
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6782 to look at the new patch set (#6). Change subject: spark: add support for fault tolerant scanner .. spark: add support for fault tolerant scanner This adds support to use fault tolerant scanner for spark job. By default non fault tolerant scanner is used. To turn on fault tolerant scanner, use job config: 'kudu.faultTolerantScan'. Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 --- M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduRDDTest.scala 5 files changed, 32 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/6782/6 -- To view, visit http://gerrit.cloudera.org:8080/6782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] spark: add support for fault tolerant scanner
Hao Hao has posted comments on this change. Change subject: spark: add support for fault tolerant scanner .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6782/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala: Line 50: val FAULT_TOLERANT_SCANNER = "kudu.fault.tolerant.scan" > looks like the normal thing to do with spark options is camel case, so this Done -- To view, visit http://gerrit.cloudera.org:8080/6782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Todd Lipcon has posted comments on this change. Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. Patch Set 19: (19 comments) http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/rpc/negotiation-test.cc File src/kudu/rpc/negotiation-test.cc: PS19, Line 155: 192.169 since this is supposed to be testing a public IP, I think it's confusing that it looks so close to one of the private blocks. How about using 8.8.8.8 (a google DNS server that a lot of people might have memorized) PS19, Line 220: unique_ptr server_socket = desc.use_test_socket ? : unique_ptr(new NegotiationTestSocket()) : : unique_ptr(new Socket()); would be shorter to just write server_socket(desc.use_test_socket ? new NegotiationTestSocket() : new Socket()) right? Line 923: } how about a case with an authenticated connection from a public routable IP? http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/rpc/server_negotiation.cc File src/kudu/rpc/server_negotiation.cc: Line 61: DEFINE_bool(allow_unauthenticated_public_connections, false, Is this option actually still relevant if we have trusted_subnets? Isn't setting this to 'true' equivalent to adding 0.0.0.0/0 as a trusted subnet? PS19, Line 62: "Allow connections from unauthenticated public routable IPs. If enabled, " : "any people on the internet can connect and browse the data, which is " : "very dangerous. It is highly recommended to keep it disabled. I'd suggest rephrasing this as: Allow unauthenticated connections from public routable IP addresses. If this is enabled and network access is not otherwise restricted by a firewall, malicious users may be able to gain unauthorized access. It is highly recommended to keep this option disabled. PS19, Line 157: encryption_ == RpcEncryption::DISABLED) { not quite sure whether this is right. I would have guessed that this would be based on whether encryption is negotiated, not based on the server side config. PS19, Line 879: unauthenticated this doesn't quite match the logic, if we're also checking encryption. ie the client could be authenticated, but have disabled encryption, and still reach this error, right? http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/security/init.cc File src/kudu/security/init.cc: PS19, Line 500: vector addrs; : RETURN_NOT_OK(GetNetworkAddresses(&addrs)); : string val = StrCat(FLAGS_trusted_subnets, JoinStrings(addrs, ",")); : SetCommandLineOptionWithMode("trusted_subnets", :val.c_str(), :google::SET_FLAGS_DEFAULT); it strikes me as a little strange to be writing back into the flags rather than having some global variable which contains the parsed info. Per the suggestion elsewhere, if we had a structured way of storing the subnets, then we could have a global with pre-parsed data, which is populated from this initialization code http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/security/init.h File src/kudu/security/init.h: Line 34: Status InitTrustedSubnets(); I think it would be better to encapsulate all of the logic somewhere in netutil, and use something like std::once_call to do the initialization. That way we don't need to explicitly init it, and all the related code could be in one place http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/util/net/net_util-test.cc File src/kudu/util/net/net_util-test.cc: Line 99: EXPECT_TRUE(addr.WithInSubnet("10.0.0.0/8")); in these tests it seems like you're always testing the case where the subnet address ends in enough 0 bits that any non-masked bits are 0. I'm not sure what the definition of a valid CIDR address is -- is it legal to refer to a network like "1.2.3.4/8"? Or is it only legal to parse one like "1.0.0.0/8"? We should respect whatever the actual spec says. http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/util/net/net_util.cc File src/kudu/util/net/net_util.cc: PS19, Line 229: ==A nit: spaces around ==s Line 236: freeifaddrs(ifap); can you use MakeScopedCleanup here so that the freeifaddrs is placed as close as possible to the getifaddrs() call? that way if we added some early-return in the loop, we wouldn't leak. Line 333: // Strip any whitespace from the cidr_addr. should the stripping behavior be part of the ParseString below? Line 340: return (s.ok() && success); should verify that bits <= 32 http://gerrit.cloudera.org:8080/#/c/6514/19/src/kudu/util/net/net_util.h File src/kudu/util/net/net_util.h: PS19, Line 110: // Returns the local network addresses. : Status GetNetworkAddresses(std::vector* addresses); The docs should be clearer here t
[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs
Mike Percy has posted comments on this change. Change subject: Refactor ConsensusStatePB to hold committed and pending configs .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/consensus/metadata.proto File src/kudu/consensus/metadata.proto: PS2, Line 113: leader_uuid I think this is a little confusing. Can this just be the current leader, and if it's not part of the committed config then the master will have to figure that out? PS2, Line 117: config Rename to committed_config; Protobuf allows you to rename things as long as the position doesn't change. PS2, Line 121: pending_leader_uuid I'm not sure why this needs to be specified separately. I think we should remove this field. http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: PS2, Line 131: initial_committed_cstate_ Rename to initial_cstate_ http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: PS2, Line 951: committed_consensus_state Rename to consensus_state -- To view, visit http://gerrit.cloudera.org:8080/6809 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] [build] disable tcmalloc for OS X builds
Alexey Serbin has posted comments on this change. Change subject: [build] disable tcmalloc for OS X builds .. Patch Set 1: Verified+1 Unrelated flake of MasterStressTest -- To view, visit http://gerrit.cloudera.org:8080/6821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] [build] disable tcmalloc for OS X builds
Alexey Serbin has submitted this change and it was merged. Change subject: [build] disable tcmalloc for OS X builds .. [build] disable tcmalloc for OS X builds As a workaround for KUDU-1998, tcmalloc is disabled for OS X builds. Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Reviewed-on: http://gerrit.cloudera.org:8080/6821 Reviewed-by: Will Berkeley Reviewed-by: Adar Dembo Tested-by: Alexey Serbin --- M CMakeLists.txt 1 file changed, 3 insertions(+), 1 deletion(-) Approvals: Adar Dembo: Looks good to me, approved Will Berkeley: Looks good to me, but someone else must approve Alexey Serbin: Verified -- To view, visit http://gerrit.cloudera.org:8080/6821 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I005b34a6ead1a17a2fde6e5d873c99fc0d003308 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Todd Lipcon has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 15: (13 comments) http://gerrit.cloudera.org:8080/#/c/6630/15/docs/design-docs/cfile.md File docs/design-docs/cfile.md: Line 31: : 32-bit unsigned integer length delimiter specify whether this includes the checksum or just the following PB Line 33: : An optional Crc32 checksum of the entire header does this include the magic? Line 43: : An optional Crc32 checksum of the entire footer same - including the footer, magic, and pb len? Line 211: When checksums for the header, data, and footer are written in the CFile nit: add comma at end of line http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile-test.cc File src/kudu/cfile/cfile-test.cc: Line 223: opts.storage_attributes.compression = compression; oh, that's interesting, we forgot to actually use the compression here? http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: Line 141: // Parse Footer first to find unsupported features. does this mean that an old server will have problems with checksummed headers, because it will fail while reading the header rather than seeing the incompatible feature flag in the footer first? Line 146: if (PREDICT_FALSE(footer_->incompatible_features() > IncompatibleFeatures::ALL)) { '>' strikes me as strange here vs doing & ~IncompatibleFeatures::ALL PS15, Line 150: ALL instead of ALL, maybe a better name would be 'SUPPORTED'? Line 239: if (footer_size >= file_size_ - kMagicAndLengthSize) { do we want to have some kind of max size here? am afraid a flipped bit in a large block would cause a huge stack alloc below. alernatively we could use faststring for the buffer below, which is still stack-allocated for small sizes but avoids blowing out the stack in the case of a corruption PS15, Line 416: uint32_t checksum_size = sizeof(uint32_t); you have this in a few places. Maybe better to just have a global constant kChecksumSize? http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: PS15, Line 230: = maybe |= here so it doesn't need to change when we add some other feature PS15, Line 500: checksum data checksum typo? http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/util/crc-test.cc File src/kudu/util/crc-test.cc: Line 50: const uint64_t expected_crc = 0xa9421b7; // Known value from crcutil usage test program. nit: kExpectedCrc -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Todd Lipcon has posted comments on this change. Change subject: KUDU-1952 Remove round-robin for block placement .. Patch Set 17: (18 comments) http://gerrit.cloudera.org:8080/#/c/6636/17//COMMIT_MSG Commit Message: PS17, Line 9: mitigate the : single-disk failure nit: we aren't mitigating the failure, but rather mitigating the effects of a single-disk failure PS17, Line 27: When loading : tablet data from a previous version of Kudu, the tablet's superblock : will not have a DataDirGroup. One will be generated containing all : data directories, as the tablet's data may already be spread across : any number of disks. what happens if we add a table on the new version, then downgrade, run for a bit, and then upgrade? the new version will still think that the tablet has blocks only on a subset of disks, even though in fact it has blocks on more, right? any way to avoid this issue? http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: Line 133: int CountFiles(const string& path) { would using Env::Walk() make this easier? http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: PS17, Line 62: evolving maybe even experimental, considering this isn't quite usable yet? Line 188: FLAGS_fs_data_dirs_full_disk_cache_seconds = 0; // Don't cache device fullness. ?? PS17, Line 410: Tablet already being tracked i think this and the one below could be a bit more specific and refer to datadir groups. Remember that Statuses don't keep track of the line of code or file where they were produced, so being pretty specific is helpful. PS17, Line 422: group_target_size = FLAGS_fs_target_data_dirs_per_tablet; : if (group_target_size > data_dirs_.size()) { : group_target_size = data_dirs_.size(); : } use std::min? Line 460: // This should only be reached by some tests; in cases where there is no maybe DCHECK(IsGTest()) from test_util_prod.h? Line 535: } why not also just skip adding the full ones here? might make the lower part easier http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: PS17, Line 53: PathSetPB maybe missed something somewhere in the design, but when we get to removing and replacing disks, what happens to the PathSetPB? We'll have to keep all of the dead UUIDs in there forever, right? PS17, Line 66: uuid_indices std::move Line 70: DCHECK(pb != nullptr); can just DCHECK(pb); Line 73: *group.mutable_uuid_indices()->Add() = uuid_idx; I think you can write this as group->add_uuid_indices(uuid_idx) Line 75: *pb = group; nit: pb->Swap(group); to avoid an extra allocation PS17, Line 251: no_dirs_found' is set to true if the group is already : // larger than the limit or if all candidates are full why not return a bad status instead of a second out-param? http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/file_block_manager.h File src/kudu/fs/file_block_manager.h: PS17, Line 67: CreateBlockOptions typo? http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/fs/fs.proto File src/kudu/fs/fs.proto: Line 116: message DataDirGroupPB { can you note where this ends up stored? http://gerrit.cloudera.org:8080/#/c/6636/17/src/kudu/tablet/delta_compaction.cc File src/kudu/tablet/delta_compaction.cc: Line 65: const string& tablet_id) nit: pass by value and then std::move below -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 17 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1999: Spark connector should login with Kerberos credentials on driver
Hello Jean-Daniel Cryans, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6822 to review the following change. Change subject: KUDU-1999: Spark connector should login with Kerberos credentials on driver .. KUDU-1999: Spark connector should login with Kerberos credentials on driver note: long-running jobs will still fail, since the credentials aren't refreshed. Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 --- M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala 4 files changed, 64 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/6822/1 -- To view, visit http://gerrit.cloudera.org:8080/6822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs
Todd Lipcon has posted comments on this change. Change subject: Refactor ConsensusStatePB to hold committed and pending configs .. Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/6809/2//COMMIT_MSG Commit Message: PS2, Line 11: pending leader not sure what the pending leader is. Configs themselves don't have leaders. Does this mean "the node who I voted for in the current term"? http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/consensus/metadata.proto File src/kudu/consensus/metadata.proto: PS2, Line 117: config > Rename to committed_config; Protobuf allows you to rename things as long as +1 PS2, Line 121: pending_leader_uuid > I'm not sure why this needs to be specified separately. I think we should r yea, I am a bit confused about this -- a leader is associated with a term, rather than a configuration, so there isn't really any concept of a 'pending' leader (see my comment elsewhere) http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/consensus/quorum_util.cc File src/kudu/consensus/quorum_util.cc: PS2, Line 109: string leader_uuid = cstate.has_pending_config() ? :cstate.pending_leader_uuid() : cstate.leader_uuid(); this part is still odd to me PS2, Line 111: RaftConfigPB config can be const ref PS2, Line 190: UNCOMMITTED_QUORUM we should really rename these constants to COMMITTED_CONFIG and PENDING_CONFIG http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/integration-tests/cluster_itest_util.cc File src/kudu/integration-tests/cluster_itest_util.cc: Line 484: // NB: FIX: wants active, not committed yea, I think this should be fixed before committing? or at least verified to not be an issue (and removed) http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: PS2, Line 131: initial_committed_cstate_ > Rename to initial_cstate_ is there anything we have to worry about with a copy going on at the same time as a config change? I dont quite remember what we do with this initial_cstate_ variable but making me nervous that we might now be copying one more bit of state than we used to, or somesuch? -- To view, visit http://gerrit.cloudera.org:8080/6809 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] KUDU-1999: Spark connector should login with Kerberos credentials on driver
Dan Burkert has posted comments on this change. Change subject: KUDU-1999: Spark connector should login with Kerberos credentials on driver .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6822/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala: Line 80: "debug" -> sc.getConf.getBoolean("kudu.jaas.debug", defaultValue = false).toString Apparently Spark filters out non-spark configurations, so this is not settable. Should probably be switched to an environment variable. -- To view, visit http://gerrit.cloudera.org:8080/6822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1999: Spark connector should login with Kerberos credentials on driver
Dan Burkert has posted comments on this change. Change subject: KUDU-1999: Spark connector should login with Kerberos credentials on driver .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6822/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala: Line 80: "debug" -> sc.getConf.getBoolean("kudu.jaas.debug", defaultValue = false).toString > Apparently Spark filters out non-spark configurations, so this is not setta Removed, since I think this can be done globally via a JVM option anyway. -- To view, visit http://gerrit.cloudera.org:8080/6822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1999: Spark connector should login with Kerberos credentials on driver
Todd Lipcon has posted comments on this change. Change subject: KUDU-1999: Spark connector should login with Kerberos credentials on driver .. Patch Set 1: (5 comments) Can you add some color to the commit message on testing? Clearly our unit test environment isn't well equipped to test this right now, but would be good to leave some notes about how to test it http://gerrit.cloudera.org:8080/#/c/6822/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala: Line 49: class KuduContext(kuduMaster: String, @transient sc: SparkContext) extends Serializable { is there any way to magically get a SparkContext from a singleton or something? If not, we'll need to document this in release notes and also update developing.adoc, where we show an example of creating a KuduContext by hand. Perhaps a compatibility path where we can handle a null spark context is in order? Even though we say 'Unstable' above, we did go and document it, so I'd feel bad breaking it. PS1, Line 59: loginUserFromKeytab maybe rename to loginUser or getSubject or something, since many of the return paths don't login from keytab? Line 62: val principal = sc.getConf.getOption("spark.yarn.principal").getOrElse(return subject) is principal always required? or is there some 'default' principal that would be applied? PS1, Line 65: if (subject != null && !subject.getPrincipals(classOf[KerberosPrincipal]).isEmpty) { : return subject : } I'm wondering if this is always a good idea. If the user specifies a keytab, maybe we should always respect it? What if the one that is already in the Subject is an unrelated principal (eg from the wrong realm, etc)? PS1, Line 295: private object KuduContext { : val Log = LoggerFactory.getLogger(classOf[KuduContext]) : } is this Scala's equivalent of a static member? -- To view, visit http://gerrit.cloudera.org:8080/6822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1999: Spark connector should login with Kerberos credentials on driver
Hello Jean-Daniel Cryans, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6822 to look at the new patch set (#2). Change subject: KUDU-1999: Spark connector should login with Kerberos credentials on driver .. KUDU-1999: Spark connector should login with Kerberos credentials on driver Tested on a secure cluster, however I don't think an automated unit test can be written for this functionality. note: long-running jobs will still fail, since the credentials aren't refreshed. Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 --- M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala 4 files changed, 74 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/6822/2 -- To view, visit http://gerrit.cloudera.org:8080/6822 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If87a470c1cf99ea52668f22b72f1f7331877ec63 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs
Mike Percy has posted comments on this change. Change subject: Refactor ConsensusStatePB to hold committed and pending configs .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/consensus/consensus_meta.cc File src/kudu/consensus/consensus_meta.cc: PS2, Line 181: committed_cstate rename to cstate http://gerrit.cloudera.org:8080/#/c/6809/2/src/kudu/tserver/tablet_copy_source_session.cc File src/kudu/tserver/tablet_copy_source_session.cc: PS2, Line 131: initial_committed_cstate_ > is there anything we have to worry about with a copy going on at the same t It's fine because it just uses the committed consensus state from the copy source (at the time the tablet copy session was initiated) and sends it to the tablet copy target replica so it can have something to start with. Then the target replays the WAL so it will get whatever it needs from that. The code that uses it is in TabletCopyClient::WriteConsensusMeta() which calls calls cmeta_->MergeCommittedConsensusStatePB(*remote_committed_cstate_) to update its committed config and term from the remote[ We should also rename remote_committed_cstate_ to remote_cstate_ in TabletCopyClient. We may want to just egrep the code base for "committed.*cstate" and rename those variables to remove the committed part. -- To view, visit http://gerrit.cloudera.org:8080/6809 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] WIP: KUDU-463. Add checksumming to cfile
Grant Henke has posted comments on this change. Change subject: WIP: KUDU-463. Add checksumming to cfile .. Patch Set 15: (14 comments) http://gerrit.cloudera.org:8080/#/c/6630/15/docs/design-docs/cfile.md File docs/design-docs/cfile.md: Line 31: : 32-bit unsigned integer length delimiter > specify whether this includes the checksum or just the following PB Done Line 33: : An optional Crc32 checksum of the entire header > does this include the magic? Done Line 43: : An optional Crc32 checksum of the entire footer > same - including the footer, magic, and pb len? Done Line 211: When checksums for the header, data, and footer are written in the CFile > nit: add comma at end of line Done http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile-test.cc File src/kudu/cfile/cfile-test.cc: Line 223: opts.storage_attributes.compression = compression; > oh, that's interesting, we forgot to actually use the compression here? I think so. I noticed it while doing my testing. Line 336: uint8_t orig = data.mutable_data()[corrupt_offset]; > Nit: could do data.data()[corrupt_offset] here I think (though it may mean Done http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: Line 141: // Parse Footer first to find unsupported features. > does this mean that an old server will have problems with checksummed heade It shouldn't because the length doesn't include the checksum. Line 146: if (PREDICT_FALSE(footer_->incompatible_features() > IncompatibleFeatures::ALL)) { > '>' strikes me as strange here vs doing & ~IncompatibleFeatures::ALL I put a comment on a similar older review and wasn't sure if I should change it. Here was the response: > Your check is technically more correct since if there are bits in the middle > that are not valid but are set, it will check that. I assumed incompatible > feature bits would be used sequentially. Especially since we output > "supported" as an integer in the log message below. Line 150: IncompatibleFeatures::ALL)); > instead of ALL, maybe a better name would be 'SUPPORTED'? Done Line 239: if (footer_size >= file_size_ - kMagicAndLengthSize) { > do we want to have some kind of max size here? am afraid a flipped bit in a ParseMagicAndLength has a max size check: if (len > kMaxHeaderFooterPBSize) { return Status::Corruption("invalid data size"); } Line 416: uint32_t checksum_size = sizeof(uint32_t); > you have this in a few places. Maybe better to just have a global constant Done http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: Line 230: incompatible_features = IncompatibleFeatures::CHECKSUM; > maybe |= here so it doesn't need to change when we add some other feature Done Line 500: // Calculate and append a checksum data checksum. > typo? Done http://gerrit.cloudera.org:8080/#/c/6630/15/src/kudu/util/crc-test.cc File src/kudu/util/crc-test.cc: Line 50: const uint64_t expected_crc = 0xa9421b7; // Known value from crcutil usage test program. > nit: kExpectedCrc Done -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-463. Add checksumming to cfile
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6630 to look at the new patch set (#16). Change subject: KUDU-463. Add checksumming to cfile .. KUDU-463. Add checksumming to cfile Adds optional checksumming and validation to cfile blocks. Introduces 2 flags to control cfile checksumming: - cfile_write_checksums (default false) - cfile_verify_checksums (default true) cfile_write_checksums is used in the CFileWriter to enable computing and appending Crc32 checksums to the end of each cfile block, header and footer cfile_write_checksums is defaulted to false to ensure upgrades don't immediately result in performance degredation and incompatible data on downgrade. It can and likely should be defaulted to true in a later release. When cfile_write_checksums is set to true, the existing incompatible_features bitset in the cfile footer is used. A "checksum" bit is set to ensure a clear error message when older versions of Kudu try to read the file. If checksums are not written the incompatible_features "checksum" bit is not set. cfile_verify_checksums is used in the CFileReader to enable validating the data against the written checksum. cfile_verify_checksums is defaulted to true since validation only occurs if checksums are written. Any data that was written before checksumming was an option or when cfile_write_checksums was false will not be verified. Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 --- M docs/design-docs/cfile.md M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/cfile/cfile_util.h M src/kudu/cfile/cfile_writer.cc M src/kudu/cfile/cfile_writer.h M src/kudu/util/crc-test.cc M src/kudu/util/crc.cc M src/kudu/util/crc.h 10 files changed, 325 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/6630/16 -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-463. Add checksumming to cfile
Grant Henke has posted comments on this change. Change subject: KUDU-463. Add checksumming to cfile .. Patch Set 16: I still want to performance test this yet too. -- To view, visit http://gerrit.cloudera.org:8080/6630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6756834cd7f27af258797a3654a95244abeb0976 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Grant Henke Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1549: compact LBM container metadata files at startup
Hello David Ribeiro Alves, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6826 to review the following change. Change subject: KUDU-1549: compact LBM container metadata files at startup .. KUDU-1549: compact LBM container metadata files at startup Here's another patch to speed up LBM startup by reducing the amount of metadata processed from disk. The idea is to find metadata files with lots of "dead" blocks (i.e. CREATE + DELETE pairs) and to "compact" them by rewriting the files without the dead blocks' records. The set of containers to compact is determined by the ratio of live blocks to total blocks, and there's a new gflag to configure that. To make this work, I had to adjust the accounting in BlockCreated() yet again. The new approach preserves next_block_offset_, but uses fs_aligned_length() directly for byte-related stats. Without this change, the aligned container stats (total_bytes and live_bytes_aligned) are completely out of whack in a container whose metadata was compacted. Like the dead container deletion patch, this one is quick and dirty in that the new logic is unsuitable for real-time compaction. Also like that patch, compaction in real-time would require non-trivial synchronization changes. Testing is done in several places: - BlockManagerStressTest: a "real" workload that'll compact some containers with additional inconsistencies injected by the LBMCorruptor. - BlockManagerTest.TestMetadataOkayDespiteFailedWrites: filesystem errors are injected into metadata compaction. - LogBlockManagerTest.TestCompactFullContainerMetadataAtStartup: explicit test for metadata compaction (and for the stat accounting fix). Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 --- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/util/env_posix.cc M src/kudu/util/status.h 7 files changed, 282 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/26/6826/1 -- To view, visit http://gerrit.cloudera.org:8080/6826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1549: delete dead LBM containers at startup
Hello David Ribeiro Alves, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6824 to review the following change. Change subject: KUDU-1549: delete dead LBM containers at startup .. KUDU-1549: delete dead LBM containers at startup Full containers with no live blocks are "dead" in that they will never be used for any block operations. These containers are safe to delete, and though we lose some forensic information in doing so, we'll wind up processing less container metadata in the next startup. This patch adds a quick and dirty implementation of dead container deletion at startup. It would be nice to also do it in real time (perhaps as a maintenance manager operation), but that requires containers to have shared ownership, a lifecycle change I'm not keen on making right now. Change-Id: I73a903092cf89508b7ba97fde3a2d6e691161b4c --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h 3 files changed, 156 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/24/6824/1 -- To view, visit http://gerrit.cloudera.org:8080/6824 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I73a903092cf89508b7ba97fde3a2d6e691161b4c Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log block manager: reopen container metadata writers at the OS level
Hello David Ribeiro Alves, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6825 to review the following change. Change subject: log block manager: reopen container metadata writers at the OS level .. log block manager: reopen container metadata writers at the OS level Another patch I'm working on compacts LBM metadata files at startup by writing compacted metadata into a temporary file, then overwriting the existing metadata file with it. For this to work properly, "reopening" a metadata file should actually reopen the file on the filesystem. This patch does just that. There should be no impact on any existing code; the reopening done after truncating a partial record from a metadata file is just as happy if done at the logical level (before) as at the physical level (now). Change-Id: I6029d499422f5a2db036d796c7e208f2af71c394 --- M src/kudu/fs/log_block_manager-test-util.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/util/pb_util-test.cc M src/kudu/util/pb_util.cc M src/kudu/util/pb_util.h 6 files changed, 50 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/6825/1 -- To view, visit http://gerrit.cloudera.org:8080/6825 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6029d499422f5a2db036d796c7e208f2af71c394 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log block manager: convert container metrics from counters to gauges
Hello David Ribeiro Alves, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6823 to review the following change. Change subject: log block manager: convert container metrics from counters to gauges .. log block manager: convert container metrics from counters to gauges Container deletion is just around the corner, so these metrics need to become gauges so that they can be decremented. As an aside, it's possible that we've already been mishandling these counters: they should never drop in value, even across restarts, and we were incrementing them every time a container was reopened at startup. However, we may have been OK because the block manager loads before the web UI, so by the time the metrics are fetchable, the counter values should be the same as what they were pre-restart. Change-Id: I343d99d42da7f4fbc0facb476c97f2ca8785031d --- M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc 2 files changed, 18 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/6823/1 -- To view, visit http://gerrit.cloudera.org:8080/6823 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I343d99d42da7f4fbc0facb476c97f2ca8785031d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Todd Lipcon